Abstract In order to test if an unknown matrix has a given rank (null hypothesis), we
consider the family of statistics that are minimum squared distances between an estimator
and the manifold of fixed-rank matrices. Under the null hypothesis, every statistic of this
family converges to a weighted chi-squared distribution. In this paper, we introduce the
constrained bootstrap to build a bootstrap estimate of the law of such statistics under the
null hypothesis. As a result, the constrained bootstrap is employed to estimate the quantile
for testing the rank. We prove the consistency of the procedure, and the simulations
shed light on the accuracy of the constrained bootstrap with respect to the traditional
asymptotic comparison. More generally, the results are extended to test if an unknown
parameter belongs to a locally smooth sub-manifold. Finally, the constrained bootstrap is
easy to compute, handles a large family of tests and works under mild assumptions.

1 Introduction



Let $M_0 \in \mathbb{R}^{p \times H}$ be an unknown matrix (arbitrarily $p \le H$). To infer the rank of $M_0$
by hypothesis testing, the general framework usually considered is the following: there exists
an estimator $\hat M \in \mathbb{R}^{p \times H}$ of $M_0$ such that

    $n^{1/2}(\hat M - M_0) \xrightarrow{d} W$, with $\mathrm{vec}(W) \sim \mathcal{N}(0, \Gamma)$,    (1)

where vec(·) vectorizes a matrix by stacking its columns. Throughout the paper, hatted
quantities are random sequences that depend on the sample size $n$, and all limits are taken
with respect to $n$. Moreover, there exists an estimator $\hat\Gamma$ such that

    $\hat\Gamma \xrightarrow{P} \Gamma$,    (2)

and in some cases, one may ask that

    $\Gamma$ is full rank.    (3)

Let $d_0$ be the rank of $M_0$ and $m \in \{1, \ldots, p\}$. We consider the set of hypotheses

    $H_0 : d_0 = m$ against $H_1 : d_0 > m$.    (4)

Thus $d_0$ can be estimated in the following way: we start by testing $m = 0$; if $H_0$ is rejected we go
a step further, $m := m + 1$; if not, we stop the procedure and the estimated rank is $\hat d = m$. In
this paper, by considering the hypotheses (4), we focus on each step of this procedure.
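
The sequential procedure described above can be sketched as a short loop. This is only a minimal illustration: the callable `rejects_h0` is a hypothetical placeholder for any test of $H_0 : d_0 = m$ against $H_1 : d_0 > m$, and the cap at `p` is an assumption made here so that the loop always terminates.

```python
from typing import Callable


def estimate_rank(rejects_h0: Callable[[int], bool], p: int) -> int:
    """Sequential rank estimation: test m = 0, 1, ... until H0 is not rejected.

    rejects_h0(m) is a hypothetical user-supplied test of H0: d0 = m
    against H1: d0 > m; it returns True when H0 is rejected.
    """
    m = 0
    # Stop at the first m whose null hypothesis is not rejected,
    # or at p, the largest possible rank of a p x H matrix with p <= H.
    while m < p and rejects_h0(m):
        m += 1
    return m


# Toy usage: a fake test that rejects H0 exactly when m < 2.
estimated = estimate_rank(lambda m: m < 2, p=5)
```

In the toy usage the fake test stops rejecting at $m = 2$, which is therefore the returned rank.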

Many different statistical tests have appeared in the literature for this purpose. For instance,
Cragg and Donald introduced a statistic based on the LU decomposition of $\hat M$, Kleibergen
and Paap [TO] studied the asymptotic behaviour of some transformation of the singular values
of $\hat M$, and Cragg and Donald [12] considered the minimum of a squared distance under rank
constraint. In some other fields with similar issues, close ideas have been developed: Bura and
Yang [7] examined a Wald-type statistic depending on the singular value decomposition of $\hat M$, and
Cook and Ni [10] also considered the minimum of a squared distance under rank constraint.
Although based on different considerations, each of the previous works relies on the test described
by (4). For comprehensiveness, in this paper we consider the following three statistics. The
first one is introduced by Li [21] as

    $\hat\Lambda_1 = n \sum_{k=m+1}^{p} \hat\lambda_k^2$,    (5)

where $(\hat\lambda_1, \ldots, \hat\lambda_p)$ are the singular values of $\hat M$ arranged in descending order. Under $H_0$ and (1),
this statistic converges in law to a weighted chi-squared distribution [7]. The main drawback
of such a test is that $\hat\Lambda_1$ is not pivotal, i.e. its asymptotic law depends on the unknown quantities
$M_0$ and $\Gamma$. Accordingly, the consistency of the associated test requires assumptions (1)
and (2). In [7] a standardized version of $\hat\Lambda_1$ is studied with

    $\hat\Lambda_2 = n \, \mathrm{vec}(\hat Q_1 \hat M \hat Q_2)^T [(\hat Q_2 \otimes \hat Q_1) \hat\Gamma (\hat Q_2 \otimes \hat Q_1)]^{+} \mathrm{vec}(\hat Q_1 \hat M \hat Q_2)$,    (6)

where $M^{+}$ stands for the Moore-Penrose inverse of $M$, and $\hat Q_1$ and $\hat Q_2$ are respectively the
orthogonal projectors on the left and right singular spaces associated with the $p - m$ smallest
singular values of $\hat M$. The authors proved that under $H_0$, if (1) and (2) hold, the Wald-type
statistic $\hat\Lambda_2$ is asymptotically chi-squared distributed. Besides, [12] and [10] proposed a
constrained estimator by minimizing a squared distance under a fixed-rank constraint as

    $\hat\Lambda_3 = n \min_{\mathrm{rank}(M) = m} \mathrm{vec}(\hat M - M)^T \hat\Gamma^{-1} \mathrm{vec}(\hat M - M)$,    (7)

which is also asymptotically chi-squared distributed under $H_0$, assuming (1), (2) and (3). We
will refer to this as the minimum discrepancy approach. Although the statistics $\hat\Lambda_2$ and $\hat\Lambda_3$ have the
convenience of being pivotal, they both require the inversion of a large matrix and this may
cause robustness problems when the sample size is not large enough. For $\alpha \in ]0, 1[$ and
under the relevant assumptions, each of the statistics $\hat\Lambda_1$, $\hat\Lambda_2$ and $\hat\Lambda_3$ is consistent at level $\alpha$
in testing (4), i.e. the level goes to $1 - \alpha$ and the power goes to 1 as $n$ goes to $\infty$.

Nevertheless, the estimation of the quantile is difficult because either the asymptotic
distribution depends on the data (non-pivotality, represented by $\hat\Lambda_1$), or the true distribution may
be quite different from the asymptotic one (slow rates of convergence, represented by $\hat\Lambda_2$ and
$\hat\Lambda_3$). The objective of the paper is to propose a bootstrap method for quantile estimation in
this context.

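An important remark, which motivates the outline of the paper, is that all the previous
statistics share the form

    $\hat\Lambda = n \| \hat B \, \mathrm{vec}(\hat M - \hat M_c) \|^2$ with $\hat M_c = \mathrm{argmin}_{\mathrm{rank}(M) = m} \| \hat A \, \mathrm{vec}(\hat M - M) \|^2$,    (8)

where $\| \cdot \|$ is the Euclidean norm, $\hat A \in \mathbb{R}^{pH \times pH}$ and $\hat B \in \mathbb{R}^{pH \times pH}$. The values of $\hat A$ and $\hat B$
corresponding to the statistics $\hat\Lambda_1$, $\hat\Lambda_2$ and $\hat\Lambda_3$ are summarized in Table 1 (see Section 2 for
the details).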
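
To make the first two statistics concrete, the following sketch computes the sum of squared smallest singular values (times $n$) and the Wald-type quadratic form with NumPy. It is an illustration under stated assumptions rather than the authors' implementation: the function names `rank_statistics` and `sym_pinv` and the tolerance `tol` are hypothetical, `gamma_hat` is assumed to be a symmetric positive semi-definite estimate of the asymptotic covariance of the column-stacked estimator, and the minimum discrepancy statistic is omitted because it requires a numerical optimizer.

```python
import numpy as np


def sym_pinv(S, tol=1e-10):
    """Moore-Penrose inverse of a symmetric PSD matrix via its eigendecomposition."""
    w, V = np.linalg.eigh((S + S.T) / 2)
    keep = w > tol * max(w.max(), 1.0)   # drop numerically zero eigenvalues
    return (V[:, keep] / w[keep]) @ V[:, keep].T


def rank_statistics(M_hat, gamma_hat, m, n):
    """Return (Lambda_1, Lambda_2) for H0: rank = m, assuming p <= H."""
    U, s, Vt = np.linalg.svd(M_hat, full_matrices=False)
    lam1 = n * np.sum(s[m:] ** 2)                      # n * sum of squared small singular values
    # Projectors on the singular spaces of the p - m smallest singular values.
    Q1 = U[:, m:] @ U[:, m:].T
    Q2 = Vt[m:, :].T @ Vt[m:, :]
    K = np.kron(Q2, Q1)                                # acts on the column-stacked vec
    v = (Q1 @ M_hat @ Q2).flatten(order="F")           # vec(Q1 M Q2)
    lam2 = n * v @ sym_pinv(K @ gamma_hat @ K) @ v     # Wald-type quadratic form
    return lam1, lam2


# Toy usage: with gamma_hat equal to the identity the two statistics coincide.
rng = np.random.default_rng(0)
M_hat = rng.normal(size=(3, 4))
lam1, lam2 = rank_statistics(M_hat, np.eye(12), m=1, n=50)
```

The identity covariance in the toy usage is a convenient sanity check: the Kronecker product of two orthogonal projectors is again an orthogonal projector, so the pseudo-inverse leaves the vectorized residual unchanged.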

Table 1: Values of $\hat A$ and $\hat B$ in (8) for $\hat\Lambda_1$, $\hat\Lambda_2$ and $\hat\Lambda_3$

We refer to traditional testing (resp. bootstrap testing) when the statistic is compared to
its asymptotic quantile (resp. bootstrap quantile). The bootstrap test is said to be consistent
at level $\alpha$ if

    $P_{H_0}(\hat\Lambda > \hat q(\alpha)) \to 1 - \alpha$ and $P_{H_1}(\hat\Lambda > \hat q(\alpha)) \to 1$,    (9)

where $\hat q(\alpha)$ is the quantile of level $\alpha$ calculated by bootstrap. The advantage of bootstrap testing
is its high level of accuracy under $H_0$ with respect to traditional testing. This fact is emphasized
by considering the two possibilities: when the statistic is pivotal and when the asymptotic law
of the statistic depends on unknown quantities. First, as highlighted by Hall [15], when the
statistic is pivotal, under some conditions the gap between the distribution of the statistic and
its bootstrap distribution is $O_P(n^{-1})$. Since the normal approximation leads to a difference
$O(n^{-1/2})$, the bootstrap enjoys a better level of accuracy. Secondly, if the asymptotic law of the
statistic is unknown, the bootstrap appears even more as a convenient alternative because it
avoids its estimation. In [17], Hall and Wilson give two guidelines for the use of bootstrap
testing:

(A) Whether the sample is drawn under $H_0$ or $H_1$, the bootstrap estimates the law of the statistic
under $H_0$.

(B) The statistic is pivotal.

The first guideline is the most crucial because if it fails it may lead to inconsistency of the
test. The second guideline aims at improving the accuracy of the test by taking full advantage
of the accuracy of the bootstrap. In this paper we propose a new procedure for bootstrap
testing in least squared constrained estimation (LSCE) (estimators as in (8) are particular cases),
called the constrained bootstrap (CS bootstrap). More precisely, the CS bootstrap aims at testing
whether or not a parameter belongs to a submanifold, and so generalises the test (4). Our
main result is the consistency of the CS bootstrap under mild conditions. As a consequence
we provide a consistent bootstrap testing procedure for testing (4) with the statistics $\hat\Lambda_1$, $\hat\Lambda_2$
and $\hat\Lambda_3$. For the sake of clarity, we address the CS bootstrap in the next section. Section 3 is
dedicated to rank estimation with special interest in the bootstrap of the statistics $\hat\Lambda_1$, $\hat\Lambda_2$ and
$\hat\Lambda_3$. Finally, the last section emphasizes the accuracy of the bootstrap in rank estimation by
providing a simulation study in sufficient dimension reduction (SDR). Accordingly, the outline
of the paper is as follows:

• The CS bootstrap in LSCE 

• Bootstrap testing procedure for $\hat\Lambda_1$, $\hat\Lambda_2$ and $\hat\Lambda_3$

• Application to SDR 

2 The constrained bootstrap for LSCE and hypothesis testing 

Because of (8), LSCE has a central place in the paper. Moreover, since LSCE arises in many
statistical fields such as M-estimation or hypothesis testing, this section is independent of the rest
of the paper.

2.1 LSCE

Let $\theta_0 \in \mathbb{R}^p$ be called the parameter of interest, and let $\hat\theta \in \mathbb{R}^p$ be an estimator of $\theta_0$. We define
the constrained estimator of $\theta_0$ as

    $\hat\theta_c = \mathrm{argmin}_{\theta \in \mathcal{M}} (\hat\theta - \theta)^T \hat A (\hat\theta - \theta)$,    (10)

where $\mathcal{M}$ is a submanifold of $\mathbb{R}^p$ with co-dimension $q$, and $\hat A \in \mathbb{R}^{p \times p}$. The constrained statistic
is defined as

    $\hat\Lambda = n (\hat\theta - \hat\theta_c)^T \hat B (\hat\theta - \hat\theta_c)$,    (11)

where $\hat B \in \mathbb{R}^{p \times p}$. Note that if $\hat A$ is full rank, the unique minimizer of (10) without constraint is
$\hat\theta$, hence it could be understood as the unconstrained estimator. We now introduce the notion
of nonsingular point in $\mathcal{M}$, which is needed to express the Lagrangian first order condition
of the optimization (10). For any function $g = (g_1, \ldots, g_q) : \mathbb{R}^p \to \mathbb{R}^q$, define its Jacobian as
$J_g = (\nabla g_1, \ldots, \nabla g_q)^T$, where $\nabla$ stands for the gradient operator.

Definition 1. We say that $\theta$ is $\mathcal{M}$-nonsingular if $\theta \in \mathcal{M}$ and if there exists a neighbourhood
$V$ and a function $g : \mathbb{R}^p \to \mathbb{R}^q$ continuously differentiable on $V$ with $J_g(\theta)$ full rank such that

    $V \cap \mathcal{M} = \{g = 0\}$.

As a consequence any point of a locally smooth submanifold is nonsingular, e.g. any matrix
with rank $m$ is a nonsingular point in the submanifold $\{\mathrm{rank}(M) = m\}$. We prove in Proposition
2 that if $\theta_0$ is $\mathcal{M}$-nonsingular, $\sqrt{n}(\hat\theta - \theta_0) \xrightarrow{d} \mathcal{N}(0, \Delta)$ and $\hat B = \hat A \xrightarrow{P} A$ is full rank, then we have

    $\hat\Lambda \xrightarrow{d} \sum_{k=1}^{p} \nu_k W_k^2$,    (12)

where the $W_k$'s are i.i.d. standard Gaussian random variables and the $\nu_k$'s are the eigenvalues of the
matrix $\Delta^{1/2} J_g(\theta_0)^T (J_g(\theta_0) A^{-1} J_g(\theta_0)^T)^{-1} J_g(\theta_0) \Delta^{1/2}$. In particular, the case $A = \Delta^{-1}$ is interesting
because $\hat\Lambda$ is then asymptotically chi-squared distributed with $q$ degrees of freedom. Otherwise,
if $\theta_0 \notin \mathcal{M}$, $\hat\Lambda$ goes to infinity in probability. Those facts shed light on a consistent testing
procedure based on LSCE with the hypotheses

    $H_0 : \theta_0 \in \mathcal{M}$ against $H_1 : \theta_0 \notin \mathcal{M}$    (13)

and the decision rule to reject $H_0$ if $\hat\Lambda$ is larger than a quantile of its asymptotic law. Accordingly
the previous framework can be seen as an extension of the Wald test statistic, which handles
the simple hypothesis $\theta_0 = \theta$ with the statistic $(\hat\theta - \theta)^T \hat\Delta^{-1} (\hat\theta - \theta)$.
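
As a concrete instance of the constrained estimator and of the constrained statistic, consider the hypothetical case where the submanifold is the unit circle of $\mathbb{R}^2$ and both weighting matrices are the identity (choices made here purely for illustration). The constrained estimator is then the radial projection of the unconstrained estimator onto the circle, so both quantities have a closed form. The function name `lsce_unit_circle` is an assumption of this sketch.

```python
import math


def lsce_unit_circle(theta_hat, n):
    """Constrained estimator and statistic when the manifold is the unit circle
    of R^2 and both weighting matrices are the identity (illustrative choices)."""
    norm = math.hypot(theta_hat[0], theta_hat[1])
    if norm == 0.0:
        raise ValueError("the projection on the circle is not unique at the origin")
    # With identity weights, the minimizer is the radial projection of theta_hat.
    theta_c = (theta_hat[0] / norm, theta_hat[1] / norm)
    # Statistic: n times the squared Euclidean distance to the circle.
    stat = n * ((theta_hat[0] - theta_c[0]) ** 2 + (theta_hat[1] - theta_c[1]) ** 2)
    return theta_c, stat


theta_c, stat = lsce_unit_circle((3.0, 4.0), n=10)
```

Here the estimator lies at distance 5 from the origin, hence at distance 4 from the circle, and the statistic equals $10 \times 4^2 = 160$.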

2.2 The bootstrap in LSCE 

Since LSCE is a particular case of estimating equation, we review the bootstrap literature along
two principal directions: estimating equations and hypothesis testing. For clarity we simplify
the framework in this section: let $X_1, \ldots, X_n$ be an i.i.d. sequence of real random variables
with law $P$, define $\gamma = \mathrm{var}(X_1)$ and $\hat\gamma = \overline{(X - \overline{X})^2}$, and put $\theta_0 = E[X_1]$, $\hat\theta = \overline{X}$, and $\hat A = \hat B = \hat\gamma^{-1}$,
where the overline stands for the empirical mean.

The original bootstrap was introduced in [13] in the following way. Let $X_1^*, \ldots, X_n^*$ be
an i.i.d. sequence of real random variables with law $\hat P = n^{-1} \sum_{i=1}^{n} \delta_{X_i}$ and define $\theta^* = \overline{X^*}$. The
distribution of $\sqrt{n}(\theta^* - \hat\theta)$ conditionally on the sample, that we call the bootstrap distribution,
is "close" to the distribution of $\sqrt{n}(\hat\theta - \theta_0)$, that we call the true distribution (in the rest of the
paper we just say "conditionally" instead of "conditionally on the sample"). For instance, it is
shown in [23] that the bootstrap distribution converges weakly to the true distribution almost
surely. One says that $\sqrt{n}(\theta^* - \hat\theta)$ bootstraps $\sqrt{n}(\hat\theta - \theta_0)$ and we will write

    $\mathcal{L}_\infty(n^{1/2}(\theta^* - \hat\theta) \mid \hat P) = \mathcal{L}_\infty(n^{1/2}(\hat\theta - \theta_0))$ a.s.,

where $\mathcal{L}_\infty(\cdot)$ and $\mathcal{L}_\infty(\cdot \mid \hat P)$ both mean the asymptotic laws, with the difference that the latter is
conditional on the sample. Equivalently, one has for every $x \in \mathbb{R}$,
$P(\sqrt{n}(\theta^* - \hat\theta) \le x \mid \hat P) - P(\sqrt{n}(\hat\theta - \theta_0) \le x) \xrightarrow{a.s.} 0$,
but the use of the bootstrap is justified by a more general result stated
in [15], which says that

    $|P(n^{1/2}(\theta^* - \hat\theta)/(\gamma^*)^{1/2} \le x \mid \hat P) - P(n^{1/2}(\hat\theta - \theta_0)/\hat\gamma^{1/2} \le x)| = O_P(n^{-1})$    (14)

with $\gamma^* = \overline{(X^* - \overline{X^*})^2}$, provided that $P$ is non-lattice. Besides, one has

    $|P(n^{1/2}(\hat\theta - \theta_0)/\hat\gamma^{1/2} \le x) - \Phi(x)| = O(n^{-1/2})$,

where $\Phi$ is the cumulative distribution function (c.d.f.) of the standard normal law. Variations
of Efron's resampling plan are proposed in [2] under the name of weighted bootstrap. For a
complete introduction to the bootstrap we refer to [15]. We now present three different
bootstrap techniques related to LSCE.¹



(i) The classical bootstrap (C bootstrap) 

The literature about the bootstrap in Z and M-estimation, see respectively and pQ, 

is based on the following principle: if $\theta_M = \mathrm{argmin}_{\theta \in \Theta} E[\phi(X, \theta)]$ is estimated by
$\hat\theta_M = \mathrm{argmin}_{\theta \in \Theta} n^{-1} \sum_{i=1}^{n} \phi(X_i, \theta)$, where $\Theta$ is an open set, then the bootstrap of
$\sqrt{n}(\hat\theta_M - \theta_M)$ is carried out by the quantity $\sqrt{n}(\theta_M^* - \hat\theta_M)$ with

    $\theta_M^* = \mathrm{argmin}_{\theta \in \Theta} n^{-1} \sum_{i=1}^{n} w_i \phi(X_i, \theta)$,    (15)

where $(w_i)$ is a sequence of random variables. The particular case where the vector
$(w_1, \ldots, w_n)$ is distributed as $\mathrm{mult}(n, (n^{-1}, \ldots, n^{-1}))$ leads to a direct application of
Efron's original bootstrap to M-estimation. Since such a bootstrap has been extensively studied,
we refer to it as the C bootstrap. To the knowledge of the authors, the C bootstrap when $\Theta$
has empty interior has not been studied yet. Nevertheless one may glimpse its bad behaviour
for the test of equal mean $H_0 : \theta_0 = \mu$. The associated least squared constrained statistic

    $n \hat\gamma^{-1} (\hat\theta - \mu)^2$

is indeed the score statistic associated with the M-estimator with $\phi(x, \theta) = \gamma^{-1}(x - \theta)^2$ and
$\Theta = \{\mu\}$. Clearly the C bootstrap through $n \gamma^{*-1}(\theta^* - \mu)^2$ does not work, because of its
bad behaviour under $H_1$ for instance. In this case it is better to use

    $n \gamma^{*-1} (\theta^* - \hat\theta)^2$,

but it cannot handle the cases of more involved hypotheses.² Whereas the C bootstrap is
not really connected with hypothesis testing, the two following bootstrap procedures are
more related to the present work.

1 A bootstrap with a Delta-method approach (see [23], chapter 23, Theorem 5) fails because $x \mapsto \min_{\|\theta\| = 1} \|x - \theta\|$
is not continuously differentiable on the unit circle.



(ii) The biased bootstrap (B bootstrap)

The B bootstrap is introduced in [16] and is directly motivated by hypothesis testing. The
original idea of their work is to re-sample with respect to the distribution $\hat P_b = \sum_{i=1}^{n} \omega_i \delta_{X_i}$,
where the $\omega_i$'s maximize

    $\sum_{i=1}^{n} \log(\omega_i)$ under the constraints $\sum_{i=1}^{n} \omega_i X_i = \mu$ and $\sum_{i=1}^{n} \omega_i = 1$.    (16)

Since the $\omega_i$'s minimize the Kullback-Leibler distance between $\hat P$ and $\hat P_b$, one can see the
resulting distribution as the closest to the original one satisfying the mean constraint.
The authors presented interesting results for the test of equal mean $\theta_0 = \mu$: essentially, the
bootstrap statistic $n \gamma_b^{*-1}(\theta_b^* - \mu)^2$, with $\theta_b^* = \overline{X_b^*}$ and the $X_{b,i}^*$ sampled from $\hat P_b$, has a chi-squared
limiting distribution whether $H_0$ or $H_1$ is assumed. As a result both guidelines (A) and (B)
are satisfied. They go further by showing that the B bootstrap outperforms the asymptotic
normal approximation for quantile estimation in the sense that $|\hat q(\alpha) - q_n(\alpha)| = O_P(n^{-1})$
whereas $|q_n(\alpha) - q_\infty(\alpha)| = O(n^{-1/2})$, where $q_\infty$, $q_n$ and $\hat q$ are the quantile functions of the
standard normal distribution, the statistic $n \hat\gamma^{-1}(\hat\theta - \mu)^2$ under $H_0$ and the bootstrapped
statistic, respectively. Although the B bootstrap matches the context of hypothesis testing,
it has been designed to handle the particular test of equal mean. To the knowledge of the
authors the study of the B bootstrap has not been extended to other tests. In view of (16),
the main drawback of the B bootstrap lies in algorithmic difficulties. Indeed when the
constraint becomes more involved, solving (16) is more difficult. As a result it is not clear
that this method could handle other situations such as fixed-rank constraints.

(iii) The estimating function bootstrap (EF bootstrap) 

Now $X_i \in \mathbb{R}^p$. Some other ideas about the bootstrap of Z-estimators can be found
in [20] and [18], and can be summarized as follows. Considering the score statistic
$S = n^{-1/2} \sum_{i=1}^{n} \frac{\partial \phi}{\partial \theta}(X_i; \theta_0)$, [18] showed that it could be bootstrapped by

    $S^* = n^{-1/2} \sum_{i=1}^{n} w_i \frac{\partial \phi}{\partial \theta}(X_i; \hat\theta)$,

where $(w_i)$ is a sequence of random variables. This bootstrap is called the EF bootstrap
and has nice computational properties. Moreover the authors argued for its use in
quantile estimation in order to test if $g(\theta_0) = 0$, where $g : \mathbb{R}^p \to \mathbb{R}^q$ is the constraint function,
by recommending essentially to use $S^{*T} J_g(\hat\theta)^T (J_g(\hat\theta) \gamma^* J_g(\hat\theta)^T)^{-1} J_g(\hat\theta) S^*$. Applying
it to the least squared context $\phi(x, \theta) = \|\gamma^{-1/2}(x - \theta)\|^2$, the EF bootstrap is carried out
by

    $n (\theta^* - \hat\theta)^T J_g(\hat\theta)^T (J_g(\hat\theta) \gamma^* J_g(\hat\theta)^T)^{-1} J_g(\hat\theta) (\theta^* - \hat\theta)$.

Although it verifies both guidelines (A) and (B) (see the article for details), one can
see that the good behaviour of such an approach is based more on the rank deficiency
of $J_g(\hat\theta)$ than on the bootstrap of $\sqrt{n}(\hat\theta - \hat\theta_c)$. Indeed $\sqrt{n}(\theta^* - \hat\theta)$ bootstraps the
unconstrained estimator $\sqrt{n}(\hat\theta - \theta_0)$. Then as the authors noticed, it is first of all a bootstrap
of the Wald-type statistic $n S^T J_g(\theta_0)^T (J_g(\theta_0) \gamma J_g(\theta_0)^T)^{-1} J_g(\theta_0) S$, which fortunately has
the same asymptotic law as the targeted one. This may induce some loss in accuracy.
Moreover, it requires the knowledge of the function $J_g$, which is not available for fixed-rank
constraints where $g$ depends on the limit $M_0$ (see Remark 1 for some details).

2 We refer to [17] for a study of this bootstrap in order to test $\theta_0 = \mu$.

Essentially both (i) and (ii) provide a bootstrap for testing simple hypotheses. The EF
bootstrap proposed in (iii) extends this limited scope by including tests of the form $g(\theta_0) = 0$
where $g$ is known. Nevertheless it does not handle the test (4), as highlighted by the
following remark.

Remark 1. Testing (4) with $\hat\Lambda_3$ results in an optimization with the constraint $\mathrm{rank}(M) = m$.
Since the set of fixed-rank matrices is a locally smooth submanifold with co-dimension
$(p - d)(H - d)$, at every point $M$ there exists a neighbourhood $V$ and a $C^\infty$ function
$g : V \to \mathbb{R}^{(p-d)(H-d)}$ such that $V \cap \{\mathrm{rank}(M) = m\} = \{g = 0\}$ and $J_g(M)$ has full rank. Moreover, we
have

    $\|\hat\Gamma^{-1/2} \mathrm{vec}(\hat M_c - M_0)\| \le 2 \|\hat\Gamma^{-1/2} \mathrm{vec}(\hat M - M_0)\|$.

If now (1) holds, the right-hand side term goes to 0 in probability and $\hat M_c \xrightarrow{P} M_0$. As a
consequence, if $\Gamma$ is invertible, for any neighbourhood of $M_0$, from a certain rank, $\hat M_c$ belongs
to it with probability 1. Then under $H_0$, since $M_0$ has rank $m$, the constrained estimator has
the expression

    $\hat M_c = \mathrm{argmin}_{g(M) = 0} \|\hat\Gamma^{-1/2} \mathrm{vec}(\hat M - M)\|$,

with $g$ depending on $M_0$. Unfortunately we know neither $g$ nor $J_g(M_0)$. This entails
some problems for the latter approach.

2.3 The constrained bootstrap 

The CS bootstrap is introduced in order to solve the issues raised in the
previous short review, which are essentially computational difficulties and the small scope of the
existing methods. The CS bootstrap targets an estimate $\hat q(\alpha)$ of the quantile of $\hat\Lambda$ under $H_0$.
The consistency of the procedure, i.e. (9), forms the main result about the CS bootstrap.
Another important issue, which is addressed beforehand in the section, is the bootstrap of the law of

    $n^{1/2}(\hat\theta_c - \theta_0)$ under $H_0$.

Basically, we show that a bootstrap of the unconstrained estimator $\sqrt{n}(\hat\theta - \theta_0)$ allows a bootstrap
of the constrained estimator $\sqrt{n}(\hat\theta_c - \theta_0)$ under $H_0$. We point out that the CS bootstrap heuristic
is rather different from that of the C and EF bootstraps. On the other hand, it shares with the
B bootstrap the idea to "reproduce" $H_0$ even if $H_1$ is realized. Assuming that we can bootstrap
$\sqrt{n}(\hat\theta - \theta_0)$, the CS bootstrap calculation of the statistic is realized as follows:

The CS bootstrap procedure

1. Compute

    $\theta_0^* = \hat\theta_c + n^{-1/2} W^*$, with $\mathcal{L}_\infty(W^* \mid \hat P) = \mathcal{L}_\infty(n^{1/2}(\hat\theta - \theta_0))$ a.s.,    (17)

where the simulation of $W^*$ can be done by a standard bootstrap procedure.³

2. Calculate

    $\theta_c^* = \mathrm{argmin}_{\theta \in \mathcal{M}} (\theta_0^* - \theta)^T A^* (\theta_0^* - \theta)$ and $\Lambda^* = n (\theta_0^* - \theta_c^*)^T B^* (\theta_0^* - \theta_c^*)$,    (18)

where $A^* \in \mathbb{R}^{p \times p}$ and $B^* \in \mathbb{R}^{p \times p}$.⁴







Intuitively, this choice appears natural because $\theta_0^*$ equals $\hat\theta_c$ plus a small perturbation going
to 0. Accordingly $\theta_0^*$ somewhat reproduces the behaviour of $\hat\theta$ under $H_0$, especially because
$W^*$ has the right asymptotic variance. Note that $A^*$ and $B^*$ could be chosen as $\hat A$ and
$\hat B$, but this is not the best choice in practice. As highlighted in (14), we should normalize
by the associated bootstrap quantities (e.g. the variance computed on the bootstrap sample).
The following lemma, proved in the Appendix, gives a first order decomposition of the bootstrap
law of $\sqrt{n}(\theta_c^* - \hat\theta_c)$ under mild conditions.

Lemma 1. Let $\mathcal{M}$ be a submanifold. Assume there exist $\hat\theta_c \in \mathcal{M}$ and $\theta_c$ an $\mathcal{M}$-nonsingular
point such that $\hat\theta_c \xrightarrow{a.s.} \theta_c$. If moreover $\mathcal{L}_\infty(\sqrt{n}(\theta_0^* - \hat\theta_c) \mid \hat P)$ exists a.s. and conditionally a.s.
$A^* \xrightarrow{P} A$ is full rank, then we have conditionally a.s.

    $n^{1/2}(\theta_c^* - \hat\theta_c) = (I - P) n^{1/2}(\theta_0^* - \hat\theta_c) + o_P(1)$,

with $P = A^{-1} J_g^T(\theta_c) (J_g(\theta_c) A^{-1} J_g^T(\theta_c))^{-1} J_g(\theta_c)$.

Note that if $\theta_0$ is $\mathcal{M}$-nonsingular and $\mathcal{L}_\infty(\sqrt{n}(\hat\theta - \theta_0) \mid P)$ exists, we can apply Lemma 1 with
$\hat\theta_c = \theta_c = \theta_0$. This gives the following proposition:

Proposition 2. Let $\mathcal{M}$ be a submanifold. Assume that $\mathcal{L}_\infty(\sqrt{n}(\hat\theta - \theta_0) \mid P)$ exists with $\theta_0$
$\mathcal{M}$-nonsingular. Assume also that $\hat A \xrightarrow{P} A$ is full rank. Then we have

    $n^{1/2}(\hat\theta_c - \theta_0) = (I - P) n^{1/2}(\hat\theta - \theta_0) + o_P(1)$,

with $P = A^{-1} J_g^T(\theta_0) (J_g(\theta_0) A^{-1} J_g^T(\theta_0))^{-1} J_g(\theta_0)$.

Proposition 2 leads easily to (12) and extends classical results [6] about constrained estimators
with constraint $\{g = 0\}$ to manifold-type constraints. Besides, the statements of Lemma 1
and Proposition 2 together explain the preceding definition of $\theta_0^*$ in (17). They also lead to the
following theorem.
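
The step from the first order decomposition of the constrained estimator to the weighted chi-squared limit of the constrained statistic can be spelled out as follows. This is a sketch rather than the authors' proof: it additionally assumes that $A$ is symmetric, and it writes $Z_n = \sqrt{n}(\hat\theta - \theta_0)$, $J = J_g(\theta_0)$, $P = A^{-1} J^T (J A^{-1} J^T)^{-1} J$, and $\Delta$ for the asymptotic covariance of $Z_n$.

```latex
\begin{align*}
\sqrt{n}(\hat\theta - \hat\theta_c)
  &= Z_n - \sqrt{n}(\hat\theta_c - \theta_0)
   = P Z_n + o_P(1), \\
\hat\Lambda
  &= n(\hat\theta - \hat\theta_c)^T \hat A (\hat\theta - \hat\theta_c)
   = Z_n^T P^T A P Z_n + o_P(1), \\
P^T A P
  &= J^T (J A^{-1} J^T)^{-1} J A^{-1} \, A \, A^{-1} J^T (J A^{-1} J^T)^{-1} J
   = J^T (J A^{-1} J^T)^{-1} J, \\
\hat\Lambda
  &\xrightarrow{d} G^T \Delta^{1/2} J^T (J A^{-1} J^T)^{-1} J \Delta^{1/2} G
   = \sum_{k=1}^{p} \nu_k W_k^2,
   \qquad G \sim \mathcal{N}(0, I_p).
\end{align*}
```

When $A = \Delta^{-1}$, the matrix $\Delta^{1/2} J^T (J \Delta J^T)^{-1} J \Delta^{1/2}$ is an orthogonal projector of rank $q$, so exactly $q$ of the $\nu_k$ equal 1 and the others vanish, which gives the chi-squared limit with $q$ degrees of freedom.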

Theorem 3. Let $\mathcal{M}$ be a submanifold. Assume that $\hat\theta \xrightarrow{a.s.} \theta_0$ with $\theta_0$ $\mathcal{M}$-nonsingular and
$\hat A \xrightarrow{P} A$ hold. If moreover (17) holds and conditionally a.s. $A^* \xrightarrow{P} A$ is full rank, then we have

    $\mathcal{L}_\infty(n^{1/2}(\theta_c^* - \hat\theta_c) \mid \hat P) = \mathcal{L}_\infty(n^{1/2}(\hat\theta_c - \theta_0))$ a.s.

3 The bootstrap procedure to get $W^*$ is not specified because it depends on $\hat\theta$. For instance, if $\hat\theta$ is a mean
over some i.i.d. random variables, one can use Efron's traditional bootstrap, and if $\hat\theta$ is an M-estimator, one
should use a bootstrap as detailed by equation (15).

4 Assumptions about $A^*$ and $B^*$ are provided further in the statements of the propositions.

Essentially, Theorem 3 is an application of Lemma 1 under $H_0$. Indeed, as seen in the proof
of Lemma 1, equation (24), the assumption $\hat\theta \xrightarrow{a.s.} \theta_0 \in \mathcal{M}$ implies that $\hat\theta_c \xrightarrow{a.s.} \theta_c$. Nevertheless
under $H_1$ nothing guarantees such a convergence (see Example 1 below). Roughly speaking,
asking for an equality in law under $H_1$ as in Theorem 3 may be too much. However,
as stated in the following theorem, we do not need $\hat\theta_c$ to converge a.s. to a constant to
ensure that the power of the corresponding test goes to 1. This leads to the consistency of the
CS bootstrap for hypothesis testing. For the statement of the consistency theorem, we need to
define the quantile function of the bootstrap statistic

    $\hat q(\alpha) = \inf\{x : \hat F(x) > 1 - \alpha\}$,

where $\hat F$ is the c.d.f. of $\Lambda^*$ conditionally on the sample.

Theorem 4. Let $\mathcal{M}$ be a manifold. Assume that $\hat\theta \xrightarrow{a.s.} \theta_0$ with $\theta_0$ $\mathcal{M}$-nonsingular under $H_0$. We
assume also that $\hat A \xrightarrow{P} A$ is full rank and $\hat B \xrightarrow{P} B$. If moreover
$\mathcal{L}_\infty(\sqrt{n}(\theta_0^* - \hat\theta_c) \mid \hat P) = \mathcal{L}_\infty(\sqrt{n}(\hat\theta - \theta_0))$ a.s. has a density, and conditionally a.s.
$A^* \xrightarrow{P} A$, $B^* \xrightarrow{P} B$, then we have

    $P_{H_0}(\hat\Lambda > \hat q(\alpha)) \to 1 - \alpha$, and $P_{H_1}(\hat\Lambda > \hat q(\alpha)) \to 1$.

In other words, the test described in (13) with statistic $\hat\Lambda$ and CS bootstrap calculation of the quantile
is consistent.

We provide the following example under $H_1$, where $\hat\theta_c$ does not converge to a constant in
probability. Although we cannot get the conclusion of Theorem 3, the least squared constrained
statistic still converges in distribution.

Example 1. Let $(X_i)_{i \in \mathbb{N}}$ be an i.i.d. sequence such that $X_1 \sim \mathcal{N}(0, 1)$. Define $\hat\theta = \overline{X}$, and
$H_0 : \theta_0^2 = 1$. Clearly $H_0$ does not hold and naturally the statistic $n \min_{\theta^2 = 1} \|\hat\theta - \theta\|^2$ goes to infinity
in probability. One can find that $\hat\theta_c = \mathrm{sign}(\overline{X})$, which does not converge. Since

    $\theta_c^* = \mathrm{argmin}_{\theta^2 = 1} \|\theta_0^* - \theta\|^2$ and $\theta_0^* = \hat\theta_c + n^{-1/2} W^*$,

we get that $\theta_c^* = \hat\theta_c$ a.s. and naturally, we do not have the asymptotic law given by Theorem 3.
Besides, the convergence to a chi-squared distribution holds for the quantity $n \min_{\theta^2 = 1} \|\theta_0^* - \theta\|^2$.
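
Example 1 can be explored numerically with the following sketch, which uses only the Python standard library. It assumes that the bootstrap perturbation is generated by Efron's resampling of the sample mean, i.e. $n^{-1/2} W^* = \overline{X^*} - \overline{X}$; the function name, the sample size and the number of bootstrap replicates are illustrative choices, not taken from the paper.

```python
import math
import random
import statistics


def example1_cs_bootstrap(n=400, n_boot=500, alpha=0.05, seed=0):
    """CS bootstrap for H0: theta_0^2 = 1 when the data are N(0, 1) (H1 holds)."""
    rng = random.Random(seed)
    x = [rng.gauss(0.0, 1.0) for _ in range(n)]
    theta_hat = statistics.fmean(x)
    # Projection on {theta^2 = 1}: the nearest of -1 and +1.
    theta_c = math.copysign(1.0, theta_hat)
    # n * min over theta^2 = 1 of (theta_hat - theta)^2.
    stat = n * (abs(theta_hat) - 1.0) ** 2

    boot = []
    for _ in range(n_boot):
        x_star = rng.choices(x, k=n)
        # theta*_0 = theta_c + n^{-1/2} W*, with W* = sqrt(n) (mean(x*) - mean(x)).
        theta_star0 = theta_c + (statistics.fmean(x_star) - theta_hat)
        boot.append(n * (abs(theta_star0) - 1.0) ** 2)

    boot.sort()
    # An empirical upper quantile (order about 1 - alpha) of the bootstrap statistics.
    q_hat = boot[min(n_boot - 1, math.ceil((1 - alpha) * n_boot))]
    return stat, q_hat


stat, q_hat = example1_cs_bootstrap()
reject = stat > q_hat
```

One should expect the observed statistic to be of order $n$ while the bootstrap quantile stays bounded, so that the test rejects, in line with the fact that the power goes to 1 even though the constrained estimator does not converge.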

3 Rank estimation with hypothesis testing 

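In this section, through a review of the literature about rank estimation, we apply the results
obtained in Section 2 to provide a consistent bootstrap procedure for the test described by (4)
associated with the statistics $\hat\Lambda_1$, $\hat\Lambda_2$ and $\hat\Lambda_3$. We define $q_0 = p - d_0$, the dimension of the kernel
of $M_0$. We denote by $(\lambda_1, \ldots, \lambda_p)$ the singular values of $M_0$ arranged in descending order and
we write the SVD of $M_0$ as

    $M_0 = (U_1 \; U_0) \begin{pmatrix} D_1 & 0 \\ 0 & 0 \end{pmatrix} (V_1 \; V_0)^T$,

with $U_1 \in \mathbb{R}^{p \times d_0}$, $U_0 \in \mathbb{R}^{p \times q_0}$, $V_1 \in \mathbb{R}^{H \times d_0}$, $V_0 \in \mathbb{R}^{H \times q_0}$, and $D_1 = \mathrm{diag}(\lambda_1, \ldots, \lambda_{d_0})$. For
$m \in \{1, \ldots, p\}$, we note $q = p - m$ and we write the SVD of $\hat M$ as

    $\hat M = (\hat U_1 \; \hat U_0) \begin{pmatrix} \hat D_1 & 0 \\ 0 & \hat D_0 \end{pmatrix} (\hat V_1 \; \hat V_0)^T$,

with $\hat U_1 \in \mathbb{R}^{p \times m}$, $\hat U_0 \in \mathbb{R}^{p \times q}$, $\hat V_1 \in \mathbb{R}^{H \times m}$, $\hat V_0 \in \mathbb{R}^{H \times q}$, $\hat D_1 = \mathrm{diag}(\hat\lambda_1, \ldots, \hat\lambda_m)$ and
$\hat D_0 = \mathrm{diag}(\hat\lambda_{m+1}, \ldots, \hat\lambda_p)$. We also introduce the orthogonal projectors

    $\hat Q_1 = I - \hat P_1 = \hat U_0 \hat U_0^T$, $\hat Q_2 = I - \hat P_2 = \hat V_0 \hat V_0^T$, $Q_1 = I - P_1 = U_0 U_0^T$ and $Q_2 = I - P_2 = V_0 V_0^T$.

Whereas the link between $\hat\Lambda_3$ and LSCE is evident, the one connecting $\hat\Lambda_1$ and $\hat\Lambda_2$ to LSCE relies
on the following classical lemma, whose proof is omitted.

Lemma 5. Let $\hat M \in \mathbb{R}^{p \times H}$. It holds that

    $\mathrm{argmin}_{\mathrm{rank}(M) = m} \|\hat M - M\|_F^2 = \hat P_1 \hat M \hat P_2$, and $\|\hat M - \hat P_1 \hat M \hat P_2\|_F^2 = \sum_{k=m+1}^{p} \hat\lambda_k^2$,

where $\hat\lambda_1, \ldots, \hat\lambda_p$ are the singular values of $\hat M$ arranged in descending order, and $\hat P_1$ and $\hat P_2$ are
the orthogonal left and right singular projectors of $\hat M$ associated with $\hat\lambda_1, \ldots, \hat\lambda_m$.

Note that in the previous lemma, $\hat P_1$ and $\hat P_2$ are uniquely determined if and only if $\hat\lambda_m \neq \hat\lambda_{m+1}$.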
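
The lemma above (the Eckart-Young theorem for the Frobenius norm) is easy to check numerically. The sketch below is an illustration with NumPy; the function name `best_rank_m` and the random test matrix are assumptions of this example.

```python
import numpy as np


def best_rank_m(M_hat, m):
    """Return P1 @ M_hat @ P2 together with the singular values of M_hat."""
    U, s, Vt = np.linalg.svd(M_hat, full_matrices=False)
    P1 = U[:, :m] @ U[:, :m].T          # projector on the m leading left singular vectors
    P2 = Vt[:m, :].T @ Vt[:m, :]        # projector on the m leading right singular vectors
    return P1 @ M_hat @ P2, s


rng = np.random.default_rng(0)
M_hat = rng.normal(size=(3, 5))
M_c, s = best_rank_m(M_hat, m=1)
# Squared Frobenius distance to the rank-m approximation;
# it should match the sum of the squared trailing singular values.
residual = np.linalg.norm(M_hat - M_c, "fro") ** 2
```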



3.1 Nonpivotal statistic 

As stated in the introduction, the statistic $\hat\Lambda_1 = n \sum_{k=m+1}^{p} \hat\lambda_k^2$ can be used to arbitrate between
the hypotheses of (4). Basically, if $H_0 : d_0 = m$ is realized, all the singular values in the sum
go to 0 and $\hat\Lambda_1$ has a weighted chi-squared limiting distribution. Otherwise, at least one
singular value converges in probability to a positive number and for any $A > 0$, $P(\hat\Lambda_1 > A) \to 1$.
The following proposition describes the asymptotic behaviour of $\hat\Lambda_1$.⁵ It was stated in [8] and
some recent extension can be found in [7]. Our statement goes further because we are also
concerned with the estimation of the asymptotic law of $\hat\Lambda_1$, i.e. the estimation of the weights
that intervene in the weighted chi-squared asymptotic law. Besides, the proof we give in the
Appendix is quite simple.⁶

Proposition 6. Under $H_0$, if (1) holds we have

    $\hat\Lambda_1 \xrightarrow{d} \sum_{k=1}^{pH} \nu_k W_k^2$,

where the $\nu_k$'s are the eigenvalues of the matrix $(Q_2 \otimes Q_1) \Gamma (Q_2 \otimes Q_1)$ and the $W_k$'s are i.i.d.
standard Gaussian variables. If moreover (2) holds, we have

    $(\hat\nu_1, \ldots, \hat\nu_{pH}) \xrightarrow{P} (\nu_1, \ldots, \nu_{pH})$,

where the $\hat\nu_k$'s are the eigenvalues of the matrix $(\hat Q_2 \otimes \hat Q_1) \hat\Gamma (\hat Q_2 \otimes \hat Q_1)$.

Remark 2. Unlike Theorem 1 in [8] or Theorem 1 in [7], we prefer to state this theorem with
the quantities $Q_1$ and $Q_2$ rather than with $U_0$ and $V_0$. Because we do not assume that the
kernel of $M_0$ has dimension 1, the vectors that form $U_0$ or $V_0$ are not unique, since vector
spaces of dimension 2 or more have an infinite number of orthonormal bases. As a consequence it does
not make sense to estimate either $U_0$ or $V_0$. To characterize the convergence of spaces, a suitable
object is their associated orthogonal projectors.
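
In practice, a weighted chi-squared limit of this kind can be turned into a critical value by plugging in the estimated weights and simulating. The following NumPy sketch is one possible way to do this, not a procedure prescribed by the paper: the function names, the number of simulations `n_sim` and the seed are assumptions, and `gamma_hat` is assumed to be a symmetric estimate of the asymptotic covariance of the column-stacked estimator.

```python
import numpy as np


def estimated_weights(M_hat, gamma_hat, m):
    """Eigenvalues of (Q2 kron Q1) gamma_hat (Q2 kron Q1), assuming p <= H."""
    U, _, Vt = np.linalg.svd(M_hat, full_matrices=False)
    Q1 = U[:, m:] @ U[:, m:].T
    Q2 = Vt[m:, :].T @ Vt[m:, :]
    K = np.kron(Q2, Q1)
    S = K @ gamma_hat @ K
    return np.linalg.eigvalsh((S + S.T) / 2)


def weighted_chi2_quantile(weights, alpha=0.05, n_sim=200_000, seed=0):
    """Monte Carlo (1 - alpha)-quantile of sum_k weights[k] * W_k^2, W_k i.i.d. N(0, 1)."""
    rng = np.random.default_rng(seed)
    w = np.clip(np.asarray(weights, dtype=float), 0.0, None)  # remove round-off negatives
    draws = rng.chisquare(1, size=(n_sim, w.size)) @ w
    return np.quantile(draws, 1 - alpha)


# Sanity check: a single unit weight gives the chi-squared(1) quantile, about 3.84.
q = weighted_chi2_quantile([1.0, 0.0, 0.0])
```

The Monte Carlo step is a design choice made for simplicity; any method for computing quantiles of a weighted sum of chi-squared variables could be substituted.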

5 A similar proposition can be stated by applying Proposition 2. Following this way, the asymptotic law depends on
$g$, which is difficult to estimate for rank constraints (see Remark 1).

6 We no longer need the results of [O] about the asymptotic behaviour of singular values.

In general, we do not know the asymptotic distribution of $\hat\Lambda_1$ because it depends on
$(Q_2 \otimes Q_1) \Gamma (Q_2 \otimes Q_1)$. On the one hand, one can consistently estimate this matrix to get an
approximation of the law of $\hat\Lambda_1$ under $H_0$. Some conditions providing the consistency of the estimation
are stated in Proposition 6. On the other hand, one can apply the CS bootstrap to estimate the
quantile of $\hat\Lambda_1$ in order to test. The main advantage of such an approach is that we no longer
need a consistent estimator of $\Gamma$, so that (2) is not needed anymore. Following Section 2
and using Lemma 5, we define

    $M_0^* = \hat P_1 \hat M \hat P_2 + n^{-1/2} W^*$ with $W^* \mid \hat P \xrightarrow{d} W$ a.s.,    (19)

with $W$ defined in (1). Accordingly, we introduce the CS bootstrap statistic

    $\Lambda_1^* = n \sum_{k=m+1}^{p} \lambda_k^{*2}$,

with $\lambda_{m+1}^*, \ldots, \lambda_p^*$ the smallest singular values of $M_0^*$. The following proposition is a
straightforward application of Theorem 4 with the submanifold $\{\mathrm{rank}(M) = m\}$.

Proposition 7. If (1), (19) and $\hat M \xrightarrow{a.s.} M_0$ hold, then the test described in (4) with the statistic
$\hat\Lambda_1$ and calculation of the quantile with $\Lambda_1^*$ is consistent.
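
The CS bootstrap for the nonpivotal statistic can be sketched as follows. The setting is hypothetical and chosen for illustration only: the estimator is assumed to be the empirical mean of $n$ i.i.d. $p \times H$ matrices stored in an array `Y` of shape `(n, p, H)`, so that the bootstrap perturbation can be drawn by Efron's resampling as $n^{-1/2} W^* = \overline{Y^*} - \overline{Y}$. The function names, the number of replicates and the seed are assumptions of this sketch.

```python
import numpy as np


def lambda1(M, m, n):
    """n times the sum of the squared singular values of M beyond the m largest."""
    s = np.linalg.svd(M, compute_uv=False)
    return n * np.sum(s[m:] ** 2)


def rank_m_projection(M, m):
    """Best rank-m approximation of M in Frobenius norm (truncated SVD)."""
    U, s, Vt = np.linalg.svd(M, full_matrices=False)
    return (U[:, :m] * s[:m]) @ Vt[:m, :]


def cs_bootstrap_lambda1(Y, m, n_boot=500, alpha=0.05, seed=0):
    """Return (statistic, bootstrap quantile, reject) for H0: rank = m."""
    rng = np.random.default_rng(seed)
    n = Y.shape[0]
    M_hat = Y.mean(axis=0)
    stat = lambda1(M_hat, m, n)
    M_c = rank_m_projection(M_hat, m)            # constrained estimator
    boot = np.empty(n_boot)
    for b in range(n_boot):
        idx = rng.integers(0, n, size=n)         # Efron resampling of the observations
        # M*_0 = M_c + n^{-1/2} W*, with W* = sqrt(n) (mean(Y*) - M_hat).
        M_star0 = M_c + (Y[idx].mean(axis=0) - M_hat)
        boot[b] = lambda1(M_star0, m, n)
    q_hat = np.quantile(boot, 1 - alpha)
    return stat, q_hat, stat > q_hat
```

The bootstrap matrix is recentred at the constrained estimator rather than at the unconstrained one, which is what makes the resampled statistic mimic the null hypothesis even when the data are generated under the alternative.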

3.2 Wald-type statistic 

The Wald-type statistic $\hat\Lambda_2 = n \, \mathrm{vec}(\hat Q_1 \hat M \hat Q_2)^T [(\hat Q_2 \otimes \hat Q_1) \hat\Gamma (\hat Q_2 \otimes \hat Q_1)]^{+} \mathrm{vec}(\hat Q_1 \hat M \hat Q_2)$ has been
introduced in [7] to get a pivotal statistic.⁷ They obtained the following theorem, for which we
provide a different proof in the Appendix.

Proposition 8. If (1) and (2) hold, we have

    $\hat\Lambda_2 \xrightarrow{d} \chi_s^2$,

with $s = \min(\mathrm{rank}(\Gamma), (p - d)(H - d))$.

Following (18), we define the associated bootstrap statistic by

    $\Lambda_2^* = n \, \mathrm{vec}(Q_1^* M_0^* Q_2^*)^T [(Q_2^* \otimes Q_1^*) \Gamma^* (Q_2^* \otimes Q_1^*)]^{+} \mathrm{vec}(Q_1^* M_0^* Q_2^*)$,

where $M_0^*$ is defined in (19), $\Gamma^* \in \mathbb{R}^{pH \times pH}$, and $Q_1^*$ and $Q_2^*$ are the eigenprojectors associated with
the smallest eigenvalues of $M_0^* M_0^{*T}$ and $M_0^{*T} M_0^*$. As with Proposition 7, the following one is an easy
application of Theorem 4.

Proposition 9. If (1), (2), (19), $\hat M \xrightarrow{a.s.} M_0$ and $\Gamma^* \xrightarrow{P} \Gamma$ hold, then the test described in (4)
with the statistic $\hat\Lambda_2$ and calculation of the quantile with $\Lambda_2^*$ is consistent.

7 We write the expression of A2 another way for the reasons explained in Remark [2] but one can recover the 
original expression by noting that for any symmetric matrix A, A + H — (AH) + if H is an orthonormal basis of 
a vector subspace of \m(A). 
3.3 Minimum Discrepancy approach

Noting that $\{\mathrm{rank}(M) = m\}$ has co-dimension $(H - m)(p - m)$ and applying (12), we get the
following proposition.⁸

Proposition 10. If (1), (2), and (3) hold, we have

    $\hat\Lambda_3 \xrightarrow{d} \chi^2_{(H-m)(p-m)}$.

In general a minimizer

    $\hat M_c = \mathrm{argmin}_{\mathrm{rank}(M) = m} \mathrm{vec}(\hat M - M)^T \hat\Gamma^{-1} \mathrm{vec}(\hat M - M)$

does not have an explicit form, as was the case for the constrained matrix associated with $\hat\Lambda_1$ and $\hat\Lambda_2$.
Therefore, we define

    $M_0^* = \hat M_c + n^{-1/2} W^*$ with $W^* \mid \hat P \xrightarrow{d} W$ a.s.,    (20)

where $W$ is defined in (1). We also define the associated CS bootstrap statistic

    $\Lambda_3^* = n \min_{\mathrm{rank}(M) = m} \mathrm{vec}(M_0^* - M)^T \Gamma^{*-1} \mathrm{vec}(M_0^* - M)$,

and applying Theorem 4 we have the following result.

Proposition 11. // (Q), (GJ>, (GJ), T* V, and M ^ M hold, then the test described in 

with the statistic $\hat\Lambda_3$ and calculation of quantiles with $\Lambda_3^*$ is consistent.

Remark 3. The set of assumptions needed to obtain Proposition [10] is stronger than the ones stated in propositions [6] and [8] ensuring the convergence of $\hat\Lambda_1$ and $\hat\Lambda_2$. As a consequence, this is also true for Proposition [11] with respect to propositions [7] and [9]. The main difference is that we add the assumption that $\Gamma$ is not rank deficient. This assumption cannot be dropped from the statement but is not so restrictive in practice. On the one hand, if $\Gamma$ is deficient, the optimization under constraint has a free coordinate, which implies the non-convergence of the minimizer. On the other hand, because $\Gamma$ is positive semi-definite, the projection of $\hat M$ on the null space of $\Gamma$ is null. Then one can apply the proposition to the restriction of $\hat M$ to the range of $\Gamma$. This is the case in the application to SDR in Section 4.

Remark 4. Unlike the situation of $\hat\Lambda_1$ and $\hat\Lambda_2$, an optimization algorithm is needed to obtain $\hat\Lambda_3$ and $\Lambda_3^*$; this points out an important issue of such a procedure. In [10], the authors noticed that

$$\hat\Lambda_3 = n \min_{A \in \mathcal{H}_d,\, B \in \mathbb{R}^{d \times H}} (\mathrm{vec}(\hat M) - \mathrm{vec}(AB))^T \hat\Gamma^{-1} (\mathrm{vec}(\hat M) - \mathrm{vec}(AB)),$$

where $\mathcal{H}_d$ is the set of orthogonal bases lying in $\mathbb{R}^p$ with dimension $d$. We follow their algorithm in the computation of $\hat\Lambda_3$ (see [10], Section 3.3 for the details).
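To make the factorized form of the minimization concrete, the following is a minimal sketch of a generic alternating least squares scheme, which is not necessarily the algorithm of [10]. The function names `vec` and `min_discrepancy` are hypothetical, the factor $A$ is not constrained to be orthonormal (this does not change the value of the minimum), and the weight matrix is assumed positive definite. Each half-step solves a generalized least squares problem exactly, so the objective cannot increase.

```python
import numpy as np

def vec(M):
    """Stack the columns of M (column-major order)."""
    return np.asarray(M).reshape(-1, order="F")

def min_discrepancy(M_hat, Gamma_inv, d, n=1, n_iter=500, tol=1e-12):
    """Approximate n * min over matrices AB of rank <= d (d >= 1) of
    vec(M_hat - AB)' Gamma_inv vec(M_hat - AB) by alternating least squares."""
    p, H = M_hat.shape
    y = vec(M_hat)
    # Start from the leading left singular vectors of M_hat.
    U, _, _ = np.linalg.svd(M_hat, full_matrices=False)
    A = U[:, :d]
    prev = np.inf
    for _ in range(n_iter):
        # B-step: vec(AB) = (I_H kron A) vec(B), weighted least squares in B.
        X = np.kron(np.eye(H), A)
        b = np.linalg.lstsq(X.T @ Gamma_inv @ X, X.T @ Gamma_inv @ y, rcond=None)[0]
        B = b.reshape((d, H), order="F")
        # A-step: vec(AB) = (B' kron I_p) vec(A), weighted least squares in A.
        Z = np.kron(B.T, np.eye(p))
        a = np.linalg.lstsq(Z.T @ Gamma_inv @ Z, Z.T @ Gamma_inv @ y, rcond=None)[0]
        A = a.reshape((p, d), order="F")
        r = y - vec(A @ B)
        val = float(r @ Gamma_inv @ r)
        if prev - val < tol:
            break
        prev = val
    return n * val, A @ B
```

With an identity weight matrix the scheme stops at the truncated SVD, so the returned value is $n$ times the sum of the squared discarded singular values. For a general weight it only guarantees a local minimum, so several starting points may be advisable.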



8 See [12] for the original proof.

3.4 The statistics $\hat\Lambda_1$, $\hat\Lambda_2$, $\hat\Lambda_3$ through an example

In the introduction, we already mentioned several drawbacks and advantages of the use of $\hat\Lambda_1$, $\hat\Lambda_2$, or $\hat\Lambda_3$. These remarks relied on both the pivotality of the statistics and large matrix inversion. Here we develop another point of view related to the algebraic nature of the statistics. According to the representation provided by Table [1], each of the statistics $\hat\Lambda_1$ and $\hat\Lambda_2$ evaluates a different distance between $\hat M$ and $\hat M_c$. For the first one it is the distance that is optimized, but for the second it is another one. This raises the issue we present here through the following example. For the sake of clarity, we consider




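$$\hat M = \begin{pmatrix} \hat\lambda_1 & 0 \\ 0 & \hat\lambda_2 \end{pmatrix} \quad\text{with}\quad \hat\lambda_k = \frac{1}{n}\sum_{i=1}^n \lambda_{k,i}, \text{ for } k = 1, 2, \text{ and } (\lambda_{k,i})_{i} \text{ i.i.d. for each } k,$$

and we test $H_0 : d_0 = 1$ against $H_1 : d_0 > 1$. Assuming that $\hat\lambda_1 > \hat\lambda_2$, we have $\hat\Lambda_1 = n\hat\lambda_2^2$. Otherwise, one can show that $\hat\Lambda_2 = n\hat\lambda_2^2/\hat v_2 + o_P(1)$, with $\hat v_k = n^{-1}\sum_{i=1}^n (\lambda_{k,i} - \hat\lambda_k)^2$. For $\hat\Lambda_3$ it is clear that the minimization can be done over the diagonal matrices $\mathrm{diag}(\lambda_1, \lambda_2)$ and one has

$$\hat\Lambda_3 = n \min_{\lambda_1\lambda_2 = 0} \left\{ \frac{(\hat\lambda_1 - \lambda_1)^2}{\hat v_1} + \frac{(\hat\lambda_2 - \lambda_2)^2}{\hat v_2} \right\} + o_P(1) = n \min\left( \frac{\hat\lambda_1^2}{\hat v_1}, \frac{\hat\lambda_2^2}{\hat v_2} \right) + o_P(1).$$

Accordingly, by the preceding propositions, the three tests can be summarized by

$$n\hat\lambda_2^2 \ \text{compared to}\ v_2\chi^2_1,$$

$$n\frac{\hat\lambda_2^2}{\hat v_2} \ \text{compared to}\ \chi^2_1,$$

$$n\min\left( \frac{\hat\lambda_1^2}{\hat v_1}, \frac{\hat\lambda_2^2}{\hat v_2} \right) \ \text{compared to}\ \chi^2_1,$$

where $v_k = \mathrm{var}(\lambda_{k,1})$. Assume there is less variance on the estimate of the smallest eigenvalue, i.e. $v_1 > v_2$, such that $\hat\lambda_1^2/\hat v_1 < \hat\lambda_2^2/\hat v_2$; this situation may arise when $\hat\lambda_1$ and $\hat\lambda_2$ have similar values but different variances. Then, to conduct the test, the statistic $\hat\lambda_1^2/\hat v_1$ is a better choice than $\hat\lambda_2^2/\hat v_2$. As a consequence, unlike $\hat\Lambda_1$ and $\hat\Lambda_2$, the statistic $\hat\Lambda_3$ appears as a coherent choice because its associated minimization takes into account the variance of the estimation.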
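The comparison in this example can be checked numerically. The sketch below computes the leading terms of the three statistics for a diagonal $2 \times 2$ example; the function name `toy_statistics` is hypothetical, and the $o_P(1)$ remainders are ignored, so the second and third outputs are approximations of the actual statistics.

```python
import numpy as np

def toy_statistics(lam):
    """lam has shape (2, n): row k holds the sample behind the k-th diagonal entry.
    Returns the leading terms of the three statistics when testing rank 1."""
    lam = np.asarray(lam, dtype=float)
    n = lam.shape[1]
    lam_hat = lam.mean(axis=1)            # estimated diagonal entries
    v_hat = lam.var(axis=1)               # empirical variances (divisor n)
    k = np.argmin(np.abs(lam_hat))        # index of the smallest estimated entry
    stat1 = n * lam_hat[k] ** 2           # unweighted squared distance
    stat2 = stat1 / v_hat[k]              # same entry, standardized by its variance
    stat3 = n * np.min(lam_hat ** 2 / v_hat)  # variance-aware choice of the entry
    return stat1, stat2, stat3
```

When the larger entry is much noisier than the smaller one, the third quantity selects the first coordinate while the two others keep using the second one, which is the situation described in the text.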



4 Application to sufficient dimension reduction 

We focus on a particularly famous method in SDR called sliced inverse regression (SIR) which 
has been introduced in [21] to deal with the regression model 

$$Y = f(PX, \varepsilon) \qquad (21)$$

where $\varepsilon \perp X \in \mathbb{R}^p$, $Y \in \mathbb{R}$, and $P$ is a projector on the vector space $E$ with dimension $d_0 < p$, called the central subspace. The objective is to estimate $E$. If $X$ is elliptically distributed, then we have that $\Sigma^{-1}\mathbb{E}[(X - \mathbb{E}[X])\psi(Y)] \in E$ with $\Sigma = \mathrm{var}(X)$, for any measurable function $\psi$. Accordingly, in order to recover the whole central subspace one needs to consider many functions $\psi$. For a given family of functions $(\psi_h)_{1 \le h \le H}$ we define $\Psi(Y) = (\psi_1(Y), \dots, \psi_H(Y))^T$. Under some additional conditions [22], the image of the matrix $\Sigma^{-1/2}\mathrm{cov}(X, \Psi(Y))$ is equal to $\Sigma^{1/2}E$. Then one can compute the SVD of an estimator of this matrix to obtain $d_0$ vectors that form an estimated basis of $\Sigma^{1/2}E$. Motivated by the curse of dimensionality, the estimation of $d_0$ is one of the most crucial points in SDR. To make that possible, a popular way consists in estimating the rank of $\Sigma^{-1/2}\mathrm{cov}(X, \Psi(Y))$ using the hypothesis testing framework given by (1) (see for example [21], [8] and [10]). Since we are interested in estimating the rank, we prefer to deal directly with $\mathrm{cov}(X, \Psi(Y))$ to avoid the introduction of additional noise due to the estimation of the matrix $\Sigma$. Assume that $((X_1, Y_1), \dots, (X_n, Y_n))$ is an i.i.d. sequence from model (21), denote by $\hat P$ its associated empirical c.d.f. and define the quantity

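$$C = \mathbb{E}[K], \quad\text{with}\quad K = (X - \mathbb{E}[X])(\Psi(Y) - \mathbb{E}[\Psi(Y)])^T,$$

associated with its empirical estimator

$$\hat C = \overline{\hat K}, \quad\text{with}\quad \hat K_i = (X_i - \overline X)(\Psi_i - \overline\Psi)^T, \ \text{and}\ \Psi_i = \Psi(Y_i).$$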

We apply the CS bootstrap to calculate the quantiles of each statistic. Facing p9JI and (|20p . we 
use an independent weighted bootstrap to reproduce the asymptotic law of $\sqrt{n}(\hat C - C)$, that is we define the bootstrap matrix

$$C^* = \hat C_c + \overline{K^*}, \quad\text{with}\quad K_i^* = w_i(\hat K_i - \overline{\hat K}) \qquad (22)$$

where $\hat C_c$ stands for the solution of an optimization problem depending on the selected statistic $\hat\Lambda_1$, $\hat\Lambda_2$ or $\hat\Lambda_3$ (see Section [3] for the details) and $(w_i)$ is a sequence of i.i.d. random variables. We also define

$$V = \mathrm{var}(\mathrm{vec}(K)) \quad\text{and}\quad V^* = \frac{1}{n}\sum_{i=1}^n \mathrm{vec}(K_i^* - \overline{K^*})\,\mathrm{vec}(K_i^* - \overline{K^*})^T.$$
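One bootstrap draw of the matrix in (22) and of the associated variance estimate can be sketched as follows. This is an illustration under stated assumptions: the function name `cs_bootstrap_draw` is hypothetical, the constrained matrix is passed in as an argument because its computation depends on the chosen statistic, and the weights are drawn as standard normal variables, which is only one possible choice of i.i.d. weights.

```python
import numpy as np

def cs_bootstrap_draw(X, Psi, C_c, rng):
    """One weighted-bootstrap draw: returns (C_star, V_star).
    X: (n, p) predictors, Psi: (n, H) transformed responses,
    C_c: (p, H) constrained matrix, supplied by the caller."""
    n = X.shape[0]
    Xc = X - X.mean(axis=0)
    Pc = Psi - Psi.mean(axis=0)
    # K_hat[i] is the p x H outer product of the centred i-th observation.
    K_hat = np.einsum("ij,ik->ijk", Xc, Pc)
    K_bar = K_hat.mean(axis=0)                 # empirical covariance matrix
    w = rng.standard_normal(n)                 # assumed weight distribution
    K_star = w[:, None, None] * (K_hat - K_bar)
    K_star_bar = K_star.mean(axis=0)
    C_star = C_c + K_star_bar
    # Column-stacking vec of each centred p x H matrix, one row per observation.
    D = np.transpose(K_star - K_star_bar, (0, 2, 1)).reshape(n, -1)
    V_star = D.T @ D / n
    return C_star, V_star
```

Repeating the draw with fresh weights gives the independent bootstrap copies from which the statistic of interest is recomputed.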

To apply propositions [7], [9] and [11], we need the following result, which is of particular interest
since it provides a new bootstrap procedure for SIR that is different from the one proposed in

m- 

Proposition 12. Assume that E[||A|| 2 ] < +00, E[||^(y)|| 2 ] and K[\\K\\j?] are finites, if more- 
over (wi) is a i.i.d. sequence of real random variables with mean and variance 1, then we 
have 

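$$\mathcal{L}_\infty(n^{1/2}\,\overline{K^*} \mid \hat P) = \mathcal{L}_\infty(n^{1/2}(\hat C - C)) \ \text{a.s.} \quad\text{and}\quad V^* \xrightarrow{P} V \ \text{conditionally a.s.}$$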
Remark 5. Taking a partition $\{I(h), h = 1, \dots, H\}$ of the range of $Y$ we recover the original SIR method with the family formed by the $\psi_h = p_h^{-1/2}\mathbf{1}_{\{Y \in I(h)\}}$'s, with $p_h = P(Y \in I(h))$. Then $C_{\mathrm{SIR}} = \Sigma^{-1/2}\mathrm{cov}(X, \mathbf{1})D^{-1/2}$, with $\mathbf{1} = (\mathbf{1}_{\{Y \in I(1)\}}, \dots, \mathbf{1}_{\{Y \in I(H)\}})^T$ and $D = \mathrm{diag}(p_h)$, is estimated by $\hat C_{\mathrm{SIR}} = \hat\Sigma^{-1/2}\,\overline{(X - \overline X)\mathbf{1}^T}\,\hat D^{-1/2}$ with $\hat D = \mathrm{diag}(\hat p_h)$, $\hat p_h = \overline{\mathbf{1}_{\{Y \in I(h)\}}}$, $\hat\Sigma = \overline{(X - \overline X)(X - \overline X)^T}$. We have the expansion

$$\begin{aligned} n^{1/2}(\hat C_{\mathrm{SIR}} - C_{\mathrm{SIR}}) ={}& n^{1/2}\,\Sigma^{-1/2}\left(\overline{(X - \mathbb{E}[X])\mathbf{1}^T} - \mathrm{cov}(X, \mathbf{1})\right)D^{-1/2} \\ & - \Sigma^{-1/2}\,n^{1/2}(\hat\Sigma^{1/2} - \Sigma^{1/2})\,C_{\mathrm{SIR}} - C_{\mathrm{SIR}}\,n^{1/2}(\hat D^{1/2} - D^{1/2})\,D^{-1/2} + o_P(1). \end{aligned}$$

As a consequence, the matrix $\hat\Sigma^{-1/2}$ and the weights $\hat p_h$ play an important role in the asymptotics of the SIR matrix. They introduce additional terms in the asymptotic distribution, and clearly the simple bootstrap presented before does not work for SIR as it was originally defined. Even if we believe that a more elaborate weighted bootstrap works to bootstrap $\sqrt{n}(\hat C_{\mathrm{SIR}} - C_{\mathrm{SIR}})$, we emphasize that it may be less accurate than the one we propose, since it complicates the asymptotics without being necessary for testing the rank.

Recall that $m$ is a non-negative integer. For $k \in \{1, 2, 3\}$ and $B \in \mathbb{N}^*$ we calculate independent copies $\Lambda^*_{k,1}, \dots, \Lambda^*_{k,B}$ with the CS bootstrap algorithm corresponding to each statistic. Then we estimate the quantile with

$$\hat q_k(\alpha) = \inf\{t : F_k^*(t) \ge \alpha\} = \Lambda^*_{k,(\lceil B\alpha \rceil)}, \quad\text{where}\quad F_k^*(t) = \frac{1}{B}\sum_{b=1}^B \mathbf{1}_{\{\Lambda^*_{k,b} \le t\}},$$

$\lceil \cdot \rceil$ is the integer ceiling function and $\Lambda^*_{k,(\cdot)}$ stands for the rank statistic associated to the sample $\Lambda^*_{k,1}, \dots, \Lambda^*_{k,B}$. On the one hand, we conduct the test described by (4) using the CS bootstrap, i.e.

$$H_0 \ \text{is rejected if}\ \hat\Lambda_k > \hat q_k(\alpha). \qquad (23)$$
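The quantile estimate and the decision rule (23) amount to an order statistic of the bootstrap copies. A minimal sketch follows, with hypothetical function names and with `alpha` understood as the level of the quantile (for example 0.95), as in the definition above.

```python
import math
import numpy as np

def bootstrap_quantile(boot_stats, alpha):
    """Smallest t such that the empirical c.d.f. of the bootstrap copies is >= alpha,
    i.e. the ceil(B * alpha)-th order statistic."""
    s = np.sort(np.asarray(boot_stats, dtype=float))
    B = s.size
    k = min(max(math.ceil(B * alpha), 1), B)   # rank of the order statistic
    return s[k - 1]

def reject_h0(stat, boot_stats, alpha):
    """Decision rule: reject when the observed statistic exceeds the bootstrap quantile."""
    return bool(stat > bootstrap_quantile(boot_stats, alpha))
```

Sorting once and indexing avoids evaluating the empirical c.d.f. on a grid, and gives the same value as the infimum in the definition.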

On the other hand, the traditional test is conducted by comparing the statistics $\hat\Lambda_2$ and $\hat\Lambda_3$ to the quantiles of their asymptotic laws, respectively given by propositions [8] and [10]. For $\hat\Lambda_1$, in general the limit in law is quite complicated⁹ (see Proposition [6]), so that we use approximations:
the Wood's approximation (see [Mj) as it is computed in the R software, an adjusted version 

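$\hat\Lambda_{1,\mathrm{adj}} = \hat\Lambda_1/a \xrightarrow{d} \chi^2_b$ with $a = \sum_{k=1}^s \hat\nu_k^2 / \sum_{k=1}^s \hat\nu_k$, $b = (\sum_{k=1}^s \hat\nu_k)^2 / \sum_{k=1}^s \hat\nu_k^2$, and a re-scaled
version $\hat\Lambda_{1,\mathrm{sc}} = \hat\Lambda_1/c \xrightarrow{d} \chi^2_s$, $c = \overline{\hat\nu}$ (see [3] for these two corrections).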
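The two corrections only require the estimated weights of the limiting weighted chi-squared law. The sketch below uses the usual moment-matching constants; the function name `scaled_and_adjusted` is hypothetical, and taking the re-scaling constant `c` as the mean of the weights is an assumption, since its definition is not legible in the text here.

```python
import numpy as np

def scaled_and_adjusted(stat, weights):
    """Return the adjusted and re-scaled versions of a statistic whose limit is a
    weighted sum of chi-squared(1) variables with the given (estimated) weights.
    Each entry is (corrected statistic, reference chi-squared degrees of freedom)."""
    nu = np.asarray(weights, dtype=float)
    s = nu.size
    a = np.sum(nu ** 2) / np.sum(nu)          # matches mean and variance
    b = np.sum(nu) ** 2 / np.sum(nu ** 2)     # possibly non-integer degrees of freedom
    c = np.sum(nu) / s                        # assumed: mean weight, matches the mean only
    return {"adjusted": (stat / a, b), "rescaled": (stat / c, s)}
```

When all weights are equal both corrections coincide, and the reference law is a chi-squared with `s` degrees of freedom.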

In all the simulations we compute the matrix $\hat C$ by taking $\Psi(t) = (\mathbf{1}_{\{t \in I(1)\}}, \dots, \mathbf{1}_{\{t \in I(H)\}})$ where the $I(h)$'s form an equi-partition of the range of the data $Y_1, \dots, Y_n$. In the whole study we put $(p, H) = (6, 5)$, $B = 1000$ and we consider $n = 50, 100, 200, 500$. Although the parameter $H$ does not really affect the SIR method, we choose a value that behaves well across all the situations.
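The construction of the matrix from slice indicators can be sketched as follows. The function names are hypothetical, and the equi-partition is interpreted here as slices containing (nearly) equal numbers of observations, which is an assumption about the slicing rule.

```python
import numpy as np

def slice_indicators(y, H):
    """Return the (n, H) matrix of indicators of H slices of the responses,
    each slice holding (nearly) the same number of observations."""
    y = np.asarray(y, dtype=float)
    order = np.argsort(y, kind="stable")
    ranks = np.empty(y.size, dtype=int)
    ranks[order] = np.arange(y.size)
    labels = (ranks * H) // y.size            # slice index in 0..H-1
    return np.eye(H)[labels]

def covariance_matrix(X, Psi):
    """Empirical covariance between predictors X (n, p) and Psi (n, H)."""
    Xc = X - X.mean(axis=0)
    Pc = Psi - Psi.mean(axis=0)
    return Xc.T @ Pc / X.shape[0]
```

Because the indicators of a partition sum to one, the rows of the resulting matrix sum to zero, so its rank is at most $H - 1$ by construction.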

The first model we study is the following standard model:

Model I: $Y = X_1 + 0.1\,\varepsilon$ with $\varepsilon \perp X$, $X \sim \mathcal{N}(0, I)$, $\varepsilon \sim \mathcal{N}(0, 1)$.

In order to highlight guidelines (A) and (B), we produce in Figure [1] two graphics, representing the situation under $H_1$ and under $H_0$ respectively, for the statistic $\hat\Lambda_3$. Similar graphics dealing with $\hat\Lambda_2$ have been drawn but are not presented here. On the first one we see that even if the sample is under $H_1$, the bootstrap distribution reflects $H_0$. As a consequence, guideline (A) is satisfied and the power of the bootstrap test goes to 1. The second graph shows that the distribution of the statistic is closer to the bootstrap distribution than to its asymptotic distribution. This has no reason to occur when the statistic is not pivotal (see the introduction and [15] for the details). As a consequence, we believe that this good fit is due to guideline (B).

In Figure [2] we analyse the asymptotic distribution of $\hat q(\alpha)$ in Model I for each statistic. To measure the error we consider the behaviour of

$$F_n(\hat q(\alpha)),$$

which is optimally equal to $1 - \alpha$. To make that possible, $F_n$ is estimated with a large sample size so that the estimation error is negligible. Then we run the CS bootstrap over 100 samples to provide, for each sample, a bootstrap estimate of the quantile $q(\alpha)$. The associated boxplots for $n = 100, 200, 500$ are provided in Figure [2]. We may notice that the behaviours of $\hat\Lambda_2$ and $\hat\Lambda_3$ are quite similar compared with that of $\hat\Lambda_1$. Even if every boxplot argues

9 When the predictors are normally distributed, it has been shown that $\hat\Lambda_1$ is asymptotically chi-squared distributed (see [8]). The authors also pointed out that it was less robust than the weighted chi-squared asymptotic as soon as the distribution of the predictors deviates from normality. As a result, we stay in the nonparametric framework by avoiding such asymptotics in this simulation study.

[Figure 1: two panels, "Plot of the distributions under $H_0$ for $\hat\Lambda_3$" and "Plot of the distributions under $H_1$ for $\hat\Lambda_3$"; each panel shows the true distribution (crosses), the bootstrap distribution and the asymptotic distribution.]

Figure 1: Plot of the asymptotic distribution, and the estimated distribution of the statistic and the bootstrap statistic for $\hat\Lambda_3$ in the case of Model I.



[Figure 2: three panels of boxplots, for n = 100, n = 200 and n = 500; legend: $\hat\Lambda_1$, $\hat\Lambda_2$, $\hat\Lambda_3$.]

Figure 2: Boxplot over 100 samples of $\hat q(\alpha)$ for $\hat\Lambda_1$, $\hat\Lambda_2$, $\hat\Lambda_3$ and $\alpha = 0.95$ in the case of Model I for different values of $n$.

                  Λ1                               Λ2               Λ3
  n   m   Wood    Resc.   Adj.    CB Λ1    Λ2      CB Λ2    Λ3      CB Λ3
 50   0   0.9988  0.9998  0.9988  0.9988   1.0000  1.0000   1.0000  1.0000
      1   0.0326  0.0590  0.0336  0.0494   0.3466  0.0744   0.3098  0.07
100   0   1.0000  1.0000  1.0000  1.0000   1.0000  1.0000   1.0000  1.0000
      1   0.0386  0.052   0.0388  0.0456   0.1494  0.0676   0.1466  0.0722
200   0   1.0000  1.0000  1.0000  1.0000   1.0000  1.0000   1.0000  1.0000
      1   0.0474  0.055   0.0476  0.0514   0.096   0.0646   0.0954  0.0664
500   0   1.0000  1.0000  1.0000  1.0000   1.0000  1.0000   1.0000  1.0000
      1   0.0492  0.0514  0.0494  0.0516   0.0656  0.0584   0.0654  0.0584

Table 2: Estimated levels and power in Model I for α = 5%.




                  Λ1                               Λ2               Λ3
  n   m   Wood    Resc.   Adj.    CB Λ1    Λ2      CB Λ2    Λ3      CB Λ3
 50   0   0.9646  0.9928  0.9656  0.9682   1.0000  1.0000   1.0000  1.0000
      1   0.0318  0.0628  0.0324  0.0496   0.3412  0.0588   0.3042  0.0628
100   0   0.9996  1.0000  0.9996  0.9996   1.0000  1.0000   1.0000  1.0000
      1   0.0336  0.0486  0.0344  0.0412   0.1516  0.0696   0.1432  0.0718
200   0   1.0000  1.0000  1.0000  1.0000   1.0000  1.0000   1.0000  1.0000
      1   0.0378  0.0486  0.038   0.0424   0.0844  0.0602   0.0832  0.0604
500   0   1.0000  1.0000  1.0000  1.0000   1.0000  1.0000   1.0000  1.0000
      1   0.0454  0.0502  0.0458  0.0474   0.0638  0.0606   0.0634  0.0608

Table 3: Estimated levels and power in Model Ia for α = 5%.



for convergence to $1 - \alpha$, testing with $\hat\Lambda_1$ seems a better choice when $n$ is small because of a quasi-immediate convergence of the bias. When $n$ increases, this is no longer evident because the variance of either $\hat\Lambda_2$ or $\hat\Lambda_3$ is smaller.

Furthermore, we go into details in Table [2] by running Model I over 5000 samples. For each of them and every statistic, we conduct the bootstrap test (23) and its traditional version. The table presents, for each $m \le d_0$, the proportion of rejected tests. This corresponds to either an estimate of the power ($m < d_0$) or an estimate of the level ($m = d_0$).

Although they do not have the best power, the clear winners are the tests based on $\hat\Lambda_1$. Inside this group, for any sample size, the bootstrap and the rescaled versions are the closest to the nominal level. Concerning $\hat\Lambda_2$ and $\hat\Lambda_3$, the results are quite impressive when $n$ is small: for $n = 50$, whereas traditional testing makes a type I error 30% of the time, the bootstrap testing goes wrong around 7% of the time. This confirms the observation made on the second graph of Figure [1].



                  Λ1                               Λ2               Λ3
  n   m   Wood    Resc.   Adj.    CB Λ1    Λ2      CB Λ2    Λ3      CB Λ3
 50   0   1.0000  1.0000  1.0000  1.0000   1.0000  1.0000   1.0000  1.0000
      1   0.034   0.1072  0.034   0.0378   0.2122  0.0396   0.1394  0.015
100   0   1.0000  1.0000  1.0000  1.0000   1.0000  1.0000   1.0000  1.0000
      1   0.037   0.0904  0.0374  0.0404   0.0986  0.0572   0.0614  0.0284
200   0   1.0000  1.0000  1.0000  1.0000   1.0000  1.0000   1.0000  1.0000
      1   0.0484  0.096   0.0488  0.0518   0.0708  0.066    0.056   0.0506
500   0   1.0000  1.0000  1.0000  1.0000   1.0000  1.0000   1.0000  1.0000
      1   0.0486  0.0912  0.0486  0.0490   0.0598  0.0664   0.0612  0.0674

Table 4: Estimated levels and power in Model Ib for α = 5%.

                  Λ1                               Λ2               Λ3
  n   m   Wood    Resc.   Adj.    CB Λ1    Λ2      CB Λ2    Λ3      CB Λ3
 50   0   0.9308  0.9884  0.9428  0.9448   1.0000  0.9988   1.0000  0.9988
      1   0.0036  0.0148  0.0050  0.0086   0.1816  0.0148   0.1404  0.0130
100   0   1.0000  1.0000  1.0000  1.0000   1.0000  1.0000   1.0000  1.0000
      1   0.0072  0.0122  0.0082  0.0096   0.0536  0.02     0.0496  0.021
200   0   1.0000  1.0000  1.0000  1.0000   1.0000  1.0000   1.0000  1.0000
      1   0.0076  0.0114  0.0086  0.0102   0.0252  0.0192   0.0248  0.02
500   0   1.0000  1.0000  1.0000  1.0000   1.0000  1.0000   1.0000  1.0000
      1   0.0068  0.0076  0.007   0.0082   0.012   0.011    0.012   0.011

Table 5: Estimated levels and power in Model II for α = 1%.



In Table [3] and Table [4] we consider the same model as Model I except that we change the distribution of the predictors: in Model Ia, $X$ has independent coordinates with a Student distribution with 5 degrees of freedom; in Model Ib, $X = X_1\epsilon + X_2(1 - \epsilon)$ with $\epsilon \sim \mathcal{B}(1/2)$, $X_1 \sim \mathcal{N}((6, 0, \dots, 0), I)$, $X_2 \sim \mathcal{N}(0, I)$. For these two models, we reach conclusions similar to those of Model I, with two new points. First, the rescaled version is not robust to the distribution of the predictors (Table [4]). Second, the algorithm employed to optimize $\hat\Lambda_3$ could fail at very small sample sizes.

We introduce a nonlinear relationship by considering the model

Model II: $Y = \tanh(X_1) + 0.1\,\varepsilon$ with $\varepsilon \perp X$, $X \sim \mathcal{N}(0, I)$, $\varepsilon \sim \mathcal{N}(0, 1)$.
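For readers who wish to reproduce the designs, the data-generating mechanisms of Models I, Ia, Ib and II can be sketched as below. The function name `simulate` and its arguments are hypothetical; the noise coefficient is read as 0.1, and the default dimension follows the value $p = 6$ used in the study.

```python
import numpy as np

def simulate(model, n, p=6, rng=None):
    """Draw (X, Y) from one of the simulation designs 'I', 'Ia', 'Ib' or 'II'."""
    if model not in ("I", "Ia", "Ib", "II"):
        raise ValueError("unknown model: %r" % (model,))
    rng = np.random.default_rng() if rng is None else rng
    if model == "Ia":
        # Independent Student coordinates with 5 degrees of freedom.
        X = rng.standard_t(5, size=(n, p))
    elif model == "Ib":
        # Equal-weight mixture of two Gaussians, one shifted by 6 in the first coordinate.
        shift = np.zeros(p)
        shift[0] = 6.0
        pick = rng.integers(0, 2, size=n)[:, None]
        X = pick * (rng.standard_normal((n, p)) + shift) \
            + (1 - pick) * rng.standard_normal((n, p))
    else:
        X = rng.standard_normal((n, p))
    signal = np.tanh(X[:, 0]) if model == "II" else X[:, 0]
    Y = signal + 0.1 * rng.standard_normal(n)
    return X, Y
```

In each of these designs the response depends on the first coordinate only, so the true dimension is $d_0 = 1$.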

In Table [5] we present similar results as in the previous tables, with the difference that the nominal level is $\alpha = 1\%$ in order to highlight differences in the power of each test. Again, the CS bootstrap induces a large improvement of the accuracy of the test with $\hat\Lambda_2$ and $\hat\Lambda_3$. At $n = 50$, the test based on $\hat\Lambda_1$ is less powerful than the others but it is more accurate under $H_0$. The winner remains the CS bootstrap with $\hat\Lambda_1$. An important new point is that at $n = 500$, it seems better to use the CS bootstrap with $\hat\Lambda_2$ and $\hat\Lambda_3$. Actually this is due to their variance, which is smaller than the variance of $\hat\Lambda_1$, as was already highlighted in Figure [2].

We conclude by increasing the difficulty, considering the following model, introduced in [21].

Model III: $Y = \dfrac{X_1}{0.5 + (X_2 + 1.5)^2} + \varepsilon$, $\varepsilon \perp X$, $X \sim \mathcal{N}(0, I)$.

We still present in Table [6] the estimated level and power with the nominal level $\alpha = 2\%$ for each test. For such a model the conclusions are quite mixed because it induces a trade-off between high power and accurate level. Indeed, when $n$ is small, the best powers are provided by the traditional tests with $\hat\Lambda_2$ and $\hat\Lambda_3$. Nevertheless, the most accurate levels can be found by looking at the CS bootstrap with $\hat\Lambda_2$ ($n = 100$) or $\hat\Lambda_1$ ($n = 200$). Moreover, the tests associated with $\hat\Lambda_1$ without bootstrap are the worst for this model. Accordingly, the simulation study highlighted the good behaviour of the CS bootstrap: in every model it improves the accuracy of the traditional test for each statistic. One may remember that the bias of the CS bootstrap with $\hat\Lambda_1$ has the fastest rate of convergence compared with the CS bootstrap of $\hat\Lambda_2$ or $\hat\Lambda_3$. On the other hand, the variance of $\Lambda_1^*$ may be greater than the variance of $\Lambda_2^*$ or $\Lambda_3^*$. Finally, for the simple models it seems better to use the CS bootstrap with the statistic $\hat\Lambda_1$.

5 Concluding remarks 

                  Λ1                               Λ2               Λ3
  n   m   Wood    Resc.   Adj.    CB Λ1    Λ2      CB Λ2    Λ3      CB Λ3
 50   0   0.9950  0.9992  0.9962  0.9960   1.0000  0.9966   1.0000  0.9966
      1   0.3750  0.5342  0.3990  0.4676   0.9074  0.5066   0.8344  0.3270
      2   0.0078  0.0156  0.0086  0.0240   0.0620  0.0164   0.0344  0.0136
100   0   1.0000  1.0000  1.0000  1.0000   1.0000  1.0000   1.0000  1.0000
      1   0.9330  0.9556  0.9368  0.9446   0.9952  0.9842   0.9934  0.9806
      2   0.0134  0.0176  0.0138  0.0210   0.0306  0.0228   0.0266  0.0278
200   0   1.000   1.0000  1.000   1.0000   1.0000  1.0000   1.0000  1.0000
      1   1.000   1.0000  1.000   1.0000   1.0000  1.0000   1.0000  1.0000
      2   0.0154  0.0182  0.0158  0.0198   0.025   0.024    0.0244  0.026
500   0   1.0000  1.000   1.0000  1.0000   1.0000  1.000    1.0000  1.0000
      1   1.0000  1.000   1.0000  1.0000   1.0000  1.0000   1.0000  1.0000
      2   0.0184  0.0194  0.0184  0.02     0.0228  0.0228   0.0228  0.023

Table 6: Estimated levels and power in Model III for α = 2%.

Throughout this study, we found that the main advantages of the CS bootstrap are:



1. Alternative to the asymptotic comparison. This argument is even stronger when the asymptotic law is unknown (or difficult to estimate) or when the asymptotic law remains too different from the law of the statistic (e.g. large matrix inversion).

2. By Theorem 4, which provides its consistency, the CS bootstrap works under mild assumptions. Essentially, we ask the manifold to be locally smooth, and we require a bootstrap of the unconstrained estimator.

3. The CS bootstrap is computationally as simple as the considered statistic.

4. In the case of rank testing, the CS bootstrap clearly improves the accuracy of traditional 
testing (cf. the simulation study). 

Besides, there exist some natural extensions of the previous work. First although it is
suitable for testing, the form of the objective function bQ is quiet restrictive. For example, we 
believe that the CS bootstrap could be extended to M and Z estimation. Secondly, conditions
that guarantee

$$\hat q(\alpha) = q_n(\alpha) + o_P(n^{-1/2})$$

have not been provided yet. This would theoretically validate the use of the CS bootstrap with
respect to traditional testing.

Appendix 
Proof of Lemma Q] 

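The whole proof is made conditionally on the sample. By definition of $\theta_c^*$, and since, with high probability, $A^*$ is full rank for $n$ large enough, we have

$$\|A^{*1/2}(\theta_c^* - \hat\theta_c)\| \le \|A^{*1/2}(\theta_0^* - \theta_c^*)\| + \|A^{*1/2}(\theta_0^* - \hat\theta_c)\| \le 2\,\|A^{*1/2}(\theta_0^* - \hat\theta_c)\|. \qquad (24)$$

Then since $\theta_0^* - \hat\theta_c \to 0$, $\hat\theta_c \to \theta_c$ and because $A^* \to A$ is full rank, one gets that $\theta_c^* \to \theta_c$.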
Therefore, since 9 C is .M-nonsingular and reffering to Definition [H we get 

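$$\operatorname*{argmin}_{\theta \in \mathcal{M}} \|A^{*1/2}(\theta_0^* - \theta)\| = \operatorname*{argmin}_{g(\theta) = 0} \|A^{*1/2}(\theta_0^* - \theta)\|,$$

with $g$ continuously differentiable at $\theta_c$ and $J_g(\theta_c)$ full rank. By assumption on $g$, $\theta_c^*$, at least for $n$ large enough, satisfies the first order conditions, that are

$$\begin{cases} A^*(\theta_0^* - \theta_c^*) - J_g^T(\theta_c^*)\lambda_n^* = 0 \\ g(\theta_c^*) = 0 \end{cases}$$

where $\lambda_n^*$ is the Lagrange multiplier. Using a Taylor expansion of $g$ around $\hat\theta_c$, we get $g(\theta_c^*) = g(\hat\theta_c) + J_g(\hat\theta_c)(\theta_c^* - \hat\theta_c) + o_P(\|\theta_c^* - \hat\theta_c\|)$, and with the previous equations we have

$$\begin{pmatrix} A^* & J_g^T(\theta_c^*) \\ J_g(\hat\theta_c) & 0 \end{pmatrix} \begin{pmatrix} \theta_c^* - \hat\theta_c \\ \lambda_n^* \end{pmatrix} = \begin{pmatrix} A^*(\theta_0^* - \hat\theta_c) \\ o_P(\|\theta_c^* - \hat\theta_c\|) \end{pmatrix}.$$

Now by Slutsky's lemma, we get

$$\begin{pmatrix} A & J_g^T(\theta_c) \\ J_g(\theta_c) & 0 \end{pmatrix} \begin{pmatrix} n^{1/2}(\theta_c^* - \hat\theta_c) \\ n^{1/2}\lambda_n^* \end{pmatrix} = n^{1/2} \begin{pmatrix} A(\theta_0^* - \hat\theta_c) \\ 0 \end{pmatrix} + o_P(1),$$

and the conclusion follows by multiplying on the left by the matrix

$$\begin{pmatrix} A^{-1} - PA^{-1} & A^{-1}J_g^T(\theta_c)\left(J_g(\theta_c)A^{-1}J_g^T(\theta_c)\right)^{-1} \end{pmatrix}$$

with $P = A^{-1}J_g^T(\theta_c)\left(J_g(\theta_c)A^{-1}J_g^T(\theta_c)\right)^{-1}J_g(\theta_c)$.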

□ 

Proof of Theorem |4] 

The proof is divided in two parts, corresponding to the level and the power of the test.
Assume $H_0$ and define $F_n$ and $F_\infty$ respectively as the c.d.f. of $\hat\Lambda$ and the weak limit of $F_n$. Note
that we can apply Proposition [2] to get

» 1/, (|:*)=" 1/, G-%)< ? -« +< -w- 

and Theorem [3] to get conditionally a.s. 

" 1/2 ft:|)=» 1/2 G_ / p)w-«+*w- 

with P detailed in the statement of Proposition [2j Using (fTTjh (fT8|) and Slutsky's theorem we 
have 

$$\mathcal{L}_\infty(\Lambda^* \mid \hat P) = \mathcal{L}_\infty(\hat\Lambda) \ \text{a.s.}$$

In other words, with probability 1, the conditional c.d.f. of $\Lambda^*$ converges pointwise to $F_\infty$. As in [23], chapter 23, Lemma 3, consider $\mathcal{A}$ the set of discontinuities of $F_\infty^{-1}$. For every $\alpha \in\, ]0, 1[\, \setminus \mathcal{A}$, we have $\hat q(\alpha) \to q(\alpha)$ a.s. (see for instance [23], chapter 21). Using Slutsky's theorem, we get $\mathcal{L}_\infty(\hat\Lambda - \hat q(\alpha)) = \mathcal{L}_\infty(\hat\Lambda - q(\alpha))$, accordingly

$$P(\hat\Lambda \le \hat q(\alpha)) \to F_\infty(q(\alpha)) \quad\text{for all } \alpha \in\, ]0, 1[\, \setminus \mathcal{A}.$$

Because $F_\infty$ is continuous, $F_\infty(q(\alpha)) = \alpha$. Since $F_\infty^{-1}$ is non-decreasing, $\mathcal{A}$ is denumerable; since $\alpha \mapsto P(\hat\Lambda \le \hat q(\alpha))$ is non-decreasing with continuous limit, the convergence is uniform and so holds for every $\alpha \in\, ]0, 1[$. This concludes the proof for the level. It remains to show that the power of the test goes to 1. Assume $H_1$ and let $\alpha \in\, ]0, 1[$; the statistic $\hat\Lambda$ goes to infinity in probability and it suffices to show that with probability 1 the bootstrap quantile $\hat q(\alpha)$ remains bounded. This means exactly that conditionally a.s. the sequence $\Lambda^*$ is tight. Note that conditionally a.s. we have

$$\Lambda^* \le n\,\|A^{*1/2}(\theta_0^* - \hat\theta_c)\|^2 = \tilde\Lambda^*,$$

where $\tilde\Lambda^*$ converges in distribution by (17), and is therefore tight. □

Proof of Proposition \5\ 

We have 

$$\hat\Lambda_1 = \|n^{1/2}\hat Q_1\hat M\hat Q_2\|_F^2 = \|n^{1/2}\mathrm{vec}(\hat Q_1\hat M\hat Q_2)\|^2. \qquad (25)$$

By the Delta method and because $H_0$ is realized, we can apply convergence results about eigenprojectors to both matrices $\hat M^T\hat M$ and $\hat M\hat M^T$ to obtain the $\sqrt{n}$-convergence for $\hat Q_1$ and $\hat Q_2$. Then we write

$$\begin{aligned} n^{1/2}\hat Q_1\hat M\hat Q_2 &= n^{1/2}\hat Q_1(\hat M - M)\hat Q_2 + n^{1/2}(\hat Q_1 - Q_1)M(\hat Q_2 - Q_2) \\ &= n^{1/2}Q_1(\hat M - M)Q_2 + O_P(n^{-1/2}), \end{aligned}$$

which suffices to obtain the first statement of the theorem. For the second statement, the symmetric matrix $(Q_2 \otimes Q_1)\Gamma(Q_2 \otimes Q_1)$ is estimated consistently by $(\hat Q_2 \otimes \hat Q_1)\hat\Gamma(\hat Q_2 \otimes \hat Q_1)$ and so are its eigenvalues. □

Proof of Proposition [8] 

We can notice that $\sqrt{n}\,\hat Q_1\hat M\hat Q_2$ has the same asymptotic law as $\sqrt{n}\,Q_1(\hat M - M)Q_2$, whose asymptotic variance is consistently estimated by $[(\hat Q_2 \otimes \hat Q_1)\hat\Gamma(\hat Q_2 \otimes \hat Q_1)]^+$ (see the proof of Proposition [6]). □

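Proof of Proposition [12]

Recall that $\hat K_i = (X_i - \overline X)(\Psi_i - \overline\Psi)^T$, $K_i^* = w_i(\hat K_i - \overline{\hat K})$ and define $K_i = (X_i - \mathbb{E}[X])(\Psi_i - \mathbb{E}[\Psi])^T$. First note that, by Slutsky's theorem, $\sqrt{n}\,\overline{K^*}$ has the same asymptotic law as $n^{-1/2}\sum_{i=1}^n w_i(\hat K_i - \mathbb{E}[K])$. Then we can develop

$$\begin{aligned} n^{-1/2}\sum_{i=1}^n w_i(\hat K_i - \mathbb{E}[K]) ={}& n^{-1/2}\sum_{i=1}^n w_i\left((X_i - \mathbb{E}[X])(\Psi_i - \overline\Psi)^T - \mathbb{E}[K]\right) + (\mathbb{E}[X] - \overline X)\,n^{-1/2}\sum_{i=1}^n w_i(\Psi_i - \overline\Psi)^T \\ ={}& n^{-1/2}\sum_{i=1}^n w_i(K_i - \mathbb{E}[K]) + n^{-1/2}\sum_{i=1}^n w_i(X_i - \mathbb{E}[X])(\mathbb{E}[\Psi] - \overline\Psi)^T \\ & + (\mathbb{E}[X] - \overline X)\,n^{-1/2}\sum_{i=1}^n w_i(\Psi_i - \overline\Psi)^T. \end{aligned}$$

Checking a Lindeberg condition as below to ensure the weak convergence of $n^{-1/2}\sum_{i=1}^n w_i(X_i - \mathbb{E}[X])$ and $n^{-1/2}\sum_{i=1}^n w_i(\Psi_i - \overline\Psi)^T$, and using Slutsky's theorem, we get conditionally a.s.

$$n^{1/2}\,\overline{K^*} = n^{-1/2}\sum_{i=1}^n w_i(K_i - \mathbb{E}[K]) + O_P(n^{-1/2}).$$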



We can apply the multidimensional version of the Lindeberg's central limit theorem (see for 
instance [5], Corollary 18.2), provided that 



1 n 

-E e oi^ 1/2 ^ii 2 i { ||^ 1 



where $\xi_i = \mathrm{vec}(K_i - \mathbb{E}[K])$ and $\hat V = n^{-1}\sum_{i=1}^n (\xi_i - \overline\xi)(\xi_i - \overline\xi)^T$. The above convergence is a consequence of the Lebesgue dominated convergence theorem, which ensures that each term of the sum goes to 0; afterwards we can conclude by Cesàro's lemma. Thus we have proved that conditionally a.s.



■n 



and it remains to note that $\hat V \to V$ a.s., the variance of the limit in law of $\sqrt{n}(\hat C - C)$, provided that $K$ has a finite moment of order 2. For the second convergence, we note that conditionally a.s.

$$V^* - \hat V = \frac{1}{n}\sum_{i=1}^n (w_i^2 - 1)\,\xi_i\xi_i^T + o_P(1),$$

then, denoting by $v_i$ a coordinate of $\xi_i\xi_i^T$, we calculate

$$\mathbb{E}\left[\left(\frac{1}{n}\sum_{i=1}^n (w_i^2 - 1)v_i\right)^2 \,\Big|\, \hat P\right] = n^{-2}\,\mathbb{E}[(w_1^2 - 1)^2]\sum_{i=1}^n v_i^2,$$

which goes to 0 a.s. provided that $K$ has a finite moment of order 4. We conclude by using the Markov inequality to get that $V^* \xrightarrow{P} V$ conditionally a.s. □